Approaches to achieve high-level heterologous protein production in plants

<h1>Introduction</h1> Genetically engineered microorganisms are a very important source of industrial and medicinal proteins. Over the last 30 years, the economic and safety advantages of recombinant microbial systems have allowed them to replace many native sources of proteins ( Cronin, 1997 ; Frank, 1998 ). Recombinant microbes have also enabled the production of alternative engineered protein forms with modified activities and properties ( Crabb and Bolin, 1999 ). However, microbes have certain limitations in the classes of proteins that can be economically produced, and in the post-translational processing that can be achieved. Thus, insect and mammalian cell cultures have also been utilized for eukaryotic protein production ( Lubiniecki and Lupker, 1994 ; Kost and Condreay, 1999 ), and transgenic animals ( Houdebine, 2002 ) are also being evaluated. However, with these systems, in particular cell cultures, production costs are prohibitively high for many proteins. Over the past decade plants have been developed as recombinant protein production systems ( Howard and Hood, 2005 ). Potentially, they have great advantages over microorganisms, and especially animal cell systems, for the costs of production ( Daniell et al ., 2001a ). Moreover, unlike bacteria, they are capable of eukaryotic post-translational modifications, most importantly glycosylation ( Hood et al ., 1997 ; Karnoup et al ., 2005 ). This can be achieved without the hyperglycosylation that is often observed with certain yeast systems ( Nakamura et al ., 1993 ; Montesino et al ., 1998 ). Plants are particularly well suited to the production of edible oral vaccines ( Streatfield and Howard, 2003 ) and industrial enzymes for biomass treatment ( Xue et al ., 2003 ; Teymouri et al ., 2004 ). For these applications, protein purification is not required and, in the case of industrial enzymes, volumes are anticipated to be very large. 
Several plant systems have been evaluated for the economic production of recombinant proteins. These can be categorized by the approach used to generate the production vehicle and by the type of tissue in which recombinant protein is produced ( Table 1 ). Three broad approaches have been followed to make heterologous proteins in plants. The most common approach, and one that is applicable to a wide range of species, has been to use biolistics or agrobacteria to generate stably transformed plants ( Davey et al ., 1989 ; Gelvin, 2003 ). Recombinant gene sequences are integrated into the host plant's nuclear genome, and the approach can be routinely applied, even to multisubunit proteins, such as antibodies ( Hiatt et al ., 1989 ). Where possible, agrobacterial infection is generally preferred, as the resulting transgenic plants are less subject to silencing ( De Wilde et al ., 2000 ). A second approach has been to use biolistics to generate transplastomic plants in which the plastid genome is modified to incorporate recombinant sequences ( Svab and Maliga, 1993 ). Homologous recombination targets the transgene to a specific locus, and transplastomic plants are not subject to silencing. Until recently, very few tissue types and species were amenable to this technology, but it has now been extended to more important crops ( Daniell et al ., 2005 ). Mitochondrial transformation is even more limited, with no reliable methodology yet developed, and so the production of recombinant proteins in plant mitochondria has not been pursued. As a third approach to produce foreign proteins in plants, some species have been inoculated with recombinant plant viral genomes. The engineered viral sequences are transiently expressed in infected plant tissues, usually leaves ( Porta and Lomonossoff, 2002 ; Canizares et al ., 2005 ). 
Transient recombinant protein production can also be accomplished using biolistics ( Godon et al ., 1993 ) or agrobacteria ( Hellens et al ., 2005 ), but, in these cases, expression tends to be impractically low. Depending on which of these approaches is followed to introduce recombinant sequences into plants, and depending on the species of plant utilized, foreign proteins can be produced in various plant tissue types and can even be secreted into surrounding media, greatly simplifying purification ( Borisjuk et al ., 1999 ; Fischer et al ., 1999 ). Thus, the tissue type for expression can greatly influence the costs of protein production ( Kusnadi et al ., 1997 ; Horn et al ., 2004 ). Other factors, such as the options for post-translational processing, potential for contamination with harmful secondary metabolites and the confidence for biological containment, differ between alternative plant expression systems and target tissues. However, production costs are the key issue for recombinant protein production, and several factors affect these costs. These include the upfront costs of generating recombinant plant material, the costs of growing and harvesting plant material, and the downstream processing and purification costs. The latter tend to increase with tissue complexity. High levels of recombinant gene expression can greatly reduce the amounts of plant biomass required and can simplify processing and purification of the product. Thus, an overriding concern for all plant systems is the achievement of high expression ( Nandi et al ., 2005 ). Some protein products, such as industrial enzymes for applications such as bleaching and edible vaccines for delivery to animals, may require only limited processing of plant material ( Hood et al ., 2003 ; Lamphear et al ., 2004 ). However, these products are relatively inexpensive, and so must be produced at high levels in plants to ensure economic viability. 
At the other extreme, more expensive products, such as pharmaceuticals for administration to humans and enzymes for pharmaceutical processing, require protein extraction and purification, greatly adding to the production costs ( Samyn-Petit et al ., 2001 ; Woodard et al ., 2003 ). These costs can be minimized through increasing expression. Thus, high-level expression is essential for economic recombinant protein production. In addition, for protein products that are not extracted and purified prior to use, expression must be sufficiently high to ensure efficacy. For example, for an edible vaccine, a sufficient dose of antigen to confer protection must be delivered in a quantity of plant tissue that can be practically ingested at a single sitting ( Lamphear et al ., 2004 ; Tacket et al ., 2004 ). Strategies have been developed to increase the levels of recombinant proteins for plant production systems. These focus on the synthesis and stability of recombinant nucleic acids and encoded proteins. The type of system used for protein production specifies the range of approaches that can be applied. With stable nuclear transgenics, fairly high levels of expression have been achieved for some recombinant proteins ( Hood et al ., 1997 ; Streatfield et al ., 2002 ), and there are many approaches available to boost expression. With transplastomic expression, very high levels of a few recombinant proteins have been achieved ( Daniell et al ., 2001b ; Kumar and Daniell, 2004 ), presumably due in part to the presence of up to 100 genomes per plastid and up to 100 plastids per cell. However, options to further boost expression are currently more limited than with nuclear transgenics. With transient viral systems, fairly high levels of expression have been achieved ( Kumagai et al ., 1993 ), and tools are available to further boost expression. In this article, the various approaches used to boost recombinant protein production levels in plants are reviewed ( Table 2 ). 
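The relationship between expression level and the amount of plant material that must be grown, processed or ingested can be made concrete with simple arithmetic. The following is a minimal sketch under stated assumptions: the function names, the total soluble protein content of the tissue, the recovery fraction and all example values are hypothetical placeholders chosen for illustration, not figures from the studies cited. The second function simply multiplies out the plastid genome copy number mentioned above.

```python
def tissue_needed_g(target_protein_mg, tsp_mg_per_g, expression_fraction, recovery=1.0):
    """Grams of plant tissue needed to deliver a given mass of recombinant protein.

    target_protein_mg   -- recombinant protein required (e.g. one vaccine dose), in mg
    tsp_mg_per_g        -- total soluble protein (TSP) per gram of tissue (assumed value)
    expression_fraction -- recombinant protein as a fraction of TSP (0.01 means 1% of TSP)
    recovery            -- fraction retained after processing (1.0 means no purification loss)
    """
    if tsp_mg_per_g <= 0:
        raise ValueError("tsp_mg_per_g must be positive")
    if not (0 < expression_fraction <= 1) or not (0 < recovery <= 1):
        raise ValueError("expression_fraction and recovery must be in (0, 1]")
    recombinant_mg_per_g = tsp_mg_per_g * expression_fraction * recovery
    return target_protein_mg / recombinant_mg_per_g


def plastid_genomes_per_cell(genomes_per_plastid, plastids_per_cell):
    """Upper bound on plastid genome copies per cell."""
    return genomes_per_plastid * plastids_per_cell


if __name__ == "__main__":
    # Hypothetical example: 1 mg dose, 10 mg TSP per g tissue.
    low = tissue_needed_g(1.0, 10.0, 0.01)   # expression at 1% of TSP
    high = tissue_needed_g(1.0, 10.0, 0.10)  # expression at 10% of TSP
    print(f"tissue needed at 1% TSP:  {low:.1f} g")
    print(f"tissue needed at 10% TSP: {high:.1f} g")
    print("plastid genomes per cell:", plastid_genomes_per_cell(100, 100))
```

Under these assumptions the tissue requirement falls in direct proportion to the expression level, which is why a tenfold gain in expression can decide whether a dose fits into a quantity of tissue that can practically be eaten at one sitting.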
<h1>Molecular approaches to boost expression</h1> Several molecular approaches have been applied to boost heterologous gene expression. These target replication, transcription, transcript stability and translation. <h2>Boosting the replication of transiently transformed sequences</h2> With transiently expressing plant viral systems, recombinant proteins are encoded on an engineered viral vector. Most attention has focused on the use of positive-sense RNA viruses, in which, in addition to replication, the inoculating genome can be directly translated. In vitro -synthesized transcripts derived from cDNA clones of viral vectors are inoculated on to susceptible plant tissues. Foreign genes are generally inserted as additional reading frames into viral genomes rather than replacing native genes ( Canizares et al ., 2005 ). This overcomes the problems observed in maintaining and expressing viral sequences when native genes are deleted. However, with many viral systems, there appear to be size constraints on the foreign protein expressed. Cis -acting elements can have a major effect on expression. For example, viral heterologous sequences can influence the expression of reporter genes ( Shivprasad et al ., 1999 ). Sequences encoding proteins required for viral replication are generally present on the inoculating transcripts. However, these replication functions can alternatively be supplied in trans from sequences stably integrated into the nuclear genome of the host plant ( Taschner et al ., 1991 ; Sanchez-Navarro et al ., 2001 ; Yusibov et al ., 2002 ). This approach allows plants to be developed that constitutively express sufficiently high levels of the viral proteins required for replication, and has the added benefit of conferring a degree of biological containment on the inoculating viral sequences. Post-transcriptional gene silencing can be avoided by expressing replicase functions from inducible promoters ( Mori et al ., 2001 ). 
As an extension of this approach to incorporate viral sequences into the host genome, efforts are underway to use viral replication machinery to boost expression from viral cassettes integrated into the genomes of transgenic plants. The aim is for sequences encoding viral replicases and sequences encoding viral vector-expressed foreign proteins to be integrated into the host plant genome, allowing RNA transcribed from the integrated viral vector to be replicated to high levels. Similarly, a geminivirus expression system has been engineered that avoids repeated infection cycles each time recombinant protein is to be produced. In this case, viral sequences encoding foreign proteins have been co-transformed with a vector directing the constitutive expression of viral replicases by biolistic bombardment of a tobacco cell line ( Hefferon and Fan, 2004 ). High-level replication can be transiently achieved and foreign protein expression enhanced. Ongoing work with this system is focused on the development of stably transformed lines with the replicase functions under inducible control and encoded by the plant genome, and the recombinant viral sequences for foreign protein expression also inserted into the genome but subject to excision and replication. As an alternative to the engineering of the genome of the host plant to provide viral replication functions, viral replicons can be delivered through infection with Agrobacterium , a process termed ‘magnifection’ ( Gleba et al ., 2005 ). Using this system, foreign protein can be transiently expressed at up to 80% of total soluble protein. <h2>Boosting transcription</h2> Increasing the levels of transcription of stably transformed sequences has probably received the most attention of the various approaches to boost expression. Transcription is an early step in the process of generating recombinant protein, and thus improvements here are viewed as having great potential to increase yield. 
Several promoters have been tested, of both plant pathogen and plant origin. In addition, some synthetic promoters have been developed. As tobacco and Arabidopsis are generally relatively easily transformed, and leaf tissues provide most of the biomass with these species, promoters that drive relatively high levels of expression in dicot leaves were the tools of choice for much early phase research. The promoters tested included those of plant viruses and highly expressed dicot leaf genes. The cauliflower mosaic virus 35S promoter ( Odell et al ., 1985 ) was the mainstay of early phase work focused on expressing recombinant proteins in dicot leaves. Using this promoter, fairly high levels of accumulation of recombinant proteins have been achieved ( Fiedler et al ., 1997 ; Gutierrez-Ortega et al ., 2005 ). Other plant viral promoters have also been tested, including a cassava vein mosaic virus promoter ( Verdaguer et al ., 1996 ), the C1 promoter of cotton leaf curl Multan virus ( Xie et al ., 2003 ) and the promoter of component 8 of Milk vetch dwarf virus ( Shirasawa-Seo et al ., 2005 ). Most of the viral promoters tested also show activity in monocots ( Shirasawa-Seo et al ., 2005 ). Some plant viral promoters show tissue specificity ( Mazithulela et al ., 2000 ). However, the strongest expressing plant viral promoters are generally not tissue specific ( Odell et al ., 1985 ), and the expression of recombinant proteins in tissues other than the target tissue can be problematical for plant health and can make biological containment difficult. Agrobacterium tumefaciens promoters have also been used to express recombinant proteins in plants. These include the nopaline synthase ( Shaw et al ., 1984 ) and mannopine synthetase promoters. 
The nopaline synthase promoter is less active than the cauliflower mosaic virus 35S promoter ( Sanders et al ., 1987 ; Harpster et al ., 1988 ), but the mannopine synthetase promoter shows comparable or increased activity ( Stefanov et al ., 1991 ). Leaf-specific expression has been achieved using plant leaf promoters, such as that of the small subunit of ribulose-bisphosphate carboxylase, giving expression levels in excess of 1% of total soluble protein ( Dai et al ., 2000 ). Dicot seed promoters have also been assessed, and very high levels of foreign gene expression have been achieved using the arcelin 5-I promoter of common bean ( De Jaeger et al ., 2002 ). Using this system, an antibody single-chain variable fragment accumulated to approximately 36% of total soluble protein. Because monocots are relatively difficult to transform, less work has focused on monocot systems, even though the seed tissues of cereals are very well suited to protein accumulation and storage. However, several groups have pursued monocot systems, often using high-expressing plant promoters, such as the maize polyubiquitin-1 promoter ( Christensen et al ., 1992 ). Engineered versions of this promoter ( Streatfield et al ., 2004 ) have achieved fairly high levels of expression in seeds. However, polyubiquitin-1 is a constitutive promoter, and therefore several seed-specific monocot promoters have also been assessed, including the embryo-specific maize globulin-1 promoter ( Belanger and Kriz, 1991 ) and the endosperm-specific maize 27-kDa zein and barley D hordein promoters ( Russell and Fromm, 1997 ; Horvath et al ., 2000 ). The expression levels achieved in the seed tissues of monocots using seed-specific promoters have exceeded those using constitutive promoters ( Hood et al ., 2003 ). Transplastomic expression requires the use of promoters that are active in the plastid, which is somewhat akin to a prokaryotic expression system. 
The promoters tested here are of highly expressed plastid genome sequences, including the constitutive promoter of the rRNA operon. Very high levels of expression have been achieved using this promoter ( Daniell et al ., 2001b ; Tregoning et al ., 2003 ). However, the high expression level here is primarily considered to be a consequence of having up to 10 000 plastid genomes per leaf cell. Foreign genes inserted into plant viral vectors for transient expression are under the transcriptional control of native plant viral promoters. As an alternative, foreign genes may be placed under the control of a viral subgenomic promoter. A subgenomic promoter can be duplicated to increase expression ( Chapman et al ., 1992 ). In addition, separate subgenomic promoters can be used for the foreign gene and the viral coat protein. In this way, deletion of sequences as a result of recombination between identical regulatory elements can be avoided ( Donson et al ., 1991 ). Several strategies have been followed to further boost transcription over that achieved with plant or plant viral promoters. Multiple copies of enhancer sequences from a highly active promoter, such as cauliflower mosaic virus 35S, can be stacked to boost expression ( Guerineau et al ., 1992 ). Further synthetic promoters have been developed that combine the most active sequences of multiple well-characterized natural promoters. For example, elements of the cauliflower mosaic virus 35S promoter and the Agrobacterium Ti plasmid mannopine synthetase promoter have been combined. In this case, the synthetic promoter is several-fold more active than either component ( Comai et al ., 1990 ). Similar results have been achieved with chimeric plant viral promoters ( Rance et al ., 2002 ). Another approach to boost transcript levels has been to stack transcription units. 
These units can be driven by repeats of the same promoter and terminator or, preferably, by different controlling elements to reduce the potential for recombination or silencing. The stacking of transcription units typically increases expression over the use of a single unit, and allows for multiple proteins to be expressed ( During et al ., 1990 ). However, transcriptional interference usually results in promoter activities being less than fully additive ( Thompson and Myatt, 1997 ). The orientation of the transcription units relative to one another affects the degree of interference observed, and interference can be eliminated using transcriptional blocker sequences between the units ( Padidam and Cao, 2001 ). An alternative way to express multiple copies of the same gene is to use synthetic bidirectional promoters with enhancer elements flanked on either side by core promoter elements ( Xie et al ., 2001 ; Li et al ., 2004 ). A further way to increase transcriptional activity is to include global regulatory sequences next to promoters on expression cassettes. These scaffold attachment or matrix attachment regions have been characterized as interacting with plant nuclear scaffolds in vitro , and are considered to place surrounding loci in locations suitable for the recruitment of transcription factors to promoters ( Spiker and Thompson, 1996 ). These global sequences can boost the expression of plant genes by an order of magnitude ( Allen et al ., 1993 ; Li et al ., 2002 ). In addition to approaches to boost transcription that focus on cis -acting elements, other strategies have been followed that utilize trans -acting factors. Transcription factors capable of activating expression from chosen cis -acting promoter sequences are co-expressed in transgenic plants ( Yang et al ., 2001 ). The transcription factors may bind directly to the promoters of the recombinant genes, or may interact with other factors binding these promoters. 
However, for coexpressed factors to have a positive effect on transcription, native levels of these factors must be limiting in the host plant. The expression unit encoding a transcription factor can be co-transformed together with the transcription unit for recombinant protein expression, an approach that has boosted expression by two- to fourfold ( Yang et al ., 2001 ). Alternatively, a transgenic line can be developed expressing elevated levels of a transcription factor, and this line can be transformed with a recombinant protein expression unit. As a further alternative, separate lines expressing a transcription factor and a recombinant expression unit can be crossed. A transcription factor can also be provided transiently by the inoculation of a transgenic line with plant viral sequences encoding the factor ( Hull et al ., 2005 ). Moreover, by expressing a transcription factor from a promoter that the factor can itself bind to, very high levels of the transacting factor can potentially be achieved through positive feedback ( Schwechheimer et al ., 2000 ). In addition to stable nuclear transgenic systems, the co-expression of trans -acting factors has been applied to transplastomic plants. For example, T7 RNA polymerase has been expressed from the nuclear genome and then targeted to the plastid, where it can transcribe sequences integrated into the plastid genome ( McBride et al ., 1994 ). <h2>Stabilizing the message and ensuring correct message processing</h2> As a recombinant message is transcribed, the stability of that message is important to ensure high-level expression. The non-translated region located downstream of the translation stop codon is critical for processing and should include signals targeting the message for polyadenylation. 
Message destabilizing sequences in the downstream region can greatly affect stability ( Green, 1993 ; Newman et al ., 1993 ), and these sequences must be avoided when preparing gene constructs to express high levels of recombinant proteins. Several plant viral and plant 3′ untranslated regions have been utilized to process messages for recombinant proteins without destabilizing these messages. They include the cauliflower mosaic virus 35S terminator and the potato proteinase inhibitor II terminator ( Hood et al ., 1997 ). For plant-based expression systems, eukaryotic genes are generally re-synthesized or engineered to remove introns. Sequences such as consensus intron splice sites, message destabilizing sequences and transcript termination sequences should be avoided in constructing these mini-genes. However, some plant intron sequences contribute positively to the expression level observed for their native genes, and these sequences can boost expression if inserted as synthetic introns into genes ( Fiume et al ., 2004 ). <h2>Boosting translation</h2> There are several approaches to increase translational activity for recombinant sequences. Several plant and plant viral 5′ non-translated regions have been used with the aim of increasing the rate of translation initiation. For example, the tobacco mosaic virus and potato virus X leader sequences boost recombinant protein levels ( Gallie et al ., 1987 ; Pooggin and Skryabin, 1992 ). In order to optimize translation, any sequences located immediately around the translation start site should be modified to fit the consensus initiation sequence, which varies between plant species ( Joshi et al ., 1997 ). Many of the genes expressed in recombinant plant systems, particularly those expressed by industrial groups, are synthesized de novo from oligonucleotides. This allows for fully codon-optimized genes to be designed to suit the expression host. 
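The design of a synthetic gene along these lines can be sketched in a few lines of code. The example below is illustrative only and is not the procedure used in any of the cited studies. The codon preference table is a hypothetical placeholder covering a handful of amino acids rather than a real host codon-usage table, and the motifs screened for (`ATTTA` as a stand-in for a message destabilizing sequence, `AATAAA` as a stand-in for a premature processing signal) are assumptions chosen for demonstration. The sketch picks the preferred codon for each residue, switches to the next choice when the same codon would otherwise repeat in a long run, and then reports any unwanted motifs in the result.

```python
# Hypothetical codon preferences (most preferred first). Placeholder values for
# illustration only; a real design would use a usage table for the chosen host.
PREFERRED_CODONS = {
    "M": ["ATG"],
    "A": ["GCC", "GCT"],
    "K": ["AAG", "AAA"],
    "L": ["CTC", "CTG"],
    "G": ["GGC", "GGA"],
    "S": ["TCC", "AGC"],
    "*": ["TGA"],
}


def back_translate(protein, table=PREFERRED_CODONS, max_run=3):
    """Back-translate a protein string, limiting runs of one identical codon.

    The preferred codon is used unless the previous `max_run` codons are all
    that same codon, in which case the second choice is used (if one exists).
    """
    codons = []
    for residue in protein:
        options = table[residue]  # raises KeyError for residues not in the table
        choice = options[0]
        run_full = len(codons) >= max_run and all(
            c == choice for c in codons[-max_run:]
        )
        if run_full and len(options) > 1:
            choice = options[1]
        codons.append(choice)
    return "".join(codons)


def find_motifs(dna, motifs):
    """Return (motif, position) pairs for every occurrence, sorted by position."""
    hits = []
    for motif in motifs:
        start = dna.find(motif)
        while start != -1:
            hits.append((motif, start))
            start = dna.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    gene = back_translate("MAAAAAKLGS*")
    print(gene)
    # Motif list is an assumption for demonstration purposes.
    print(find_motifs(gene, ["ATTTA", "AATAAA"]))
```

A fuller design tool would also score predicted mRNA secondary structure and the sequence context of the start codon; both are left out here to keep the sketch short.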
Experiments with tobacco-expressed green fluorescent protein have demonstrated the benefit of codon optimization in plants ( Rouwendal et al ., 1997 ). Preferred codon usage differs for monocots vs. dicots, and is greatly different for genes expressed from plastid genomes. Codon optimization within approximately 40 amino acids of the N-terminus is particularly important for increasing recombinant protein production ( Batard et al ., 2000 ). Engineering the sequence here allows for most of the benefits of codon optimization, whilst minimizing gene construction costs. As another alternative to synthesizing whole genes, specific rare codons can be removed by site-specific mutagenesis. However, when optimizing a gene, extensive runs or localized concentrations of a specific codon should be avoided, so that the corresponding tRNA does not become rate limiting. Extensive predicted mRNA secondary structures that might hinder translation should also be avoided, as should internal ribosome entry sites that might compete for translation of the full message. Polycistronic messages offer an alternative to multiple transcription units for the expression of more than one recombinant protein in transgenic plants. Co-expression of two reporter genes together with the cauliflower mosaic virus translational activator allowed for comparable levels of each foreign protein to accumulate ( Futterer and Hohn, 1991 ). The expression of multiple foreign genes as polycistronic messages has also been achieved in plastids ( Staub and Maliga, 1995 ; Quesada-Vargas et al ., 2005 ). <h1>Genetic approaches to boost expression</h1> Several genetic approaches have been applied to boost expression, including increasing the transgene copy number through selfing or crossing, and introducing foreign genes into germplasm suited to their over-expression. 
<h2>Increasing transgene copy number</h2> With stable transgenics, taking a single transgene to homozygosity through selfing doubles transgene representation and typically boosts expression ( Zhong et al ., 1999 ). Moreover, crossing high-expressing transgenic lines arising from independent transformation events can boost transgene copy number and expression levels. This approach to achieve high gene copy numbers appears to have a reduced risk of gene silencing in comparison with making multiple gene insertions during transformation. <h2>Selection of germplasm</h2> Certain germplasm is well suited to the expression of proteins at high levels. For example, maize lines have been developed over many generations with elevated protein or oil levels in seeds ( Bhattramakki and Kriz, 1996 ; Ting et al ., 1996 ). High-protein lines have the potential for the production of more recombinant protein in the seed on a weight basis. In high-oil lines, the increased proportion of oil is primarily a result of increased embryo size. Therefore, the level of recombinant proteins expressed in the embryo is also increased on a seed weight basis. As maize embryos have a much higher proportion of soluble to insoluble protein compared with endosperm, the use of high-oil lines can also increase recombinant protein expression on a percentage of total soluble protein basis. The use of high-oil lines has been shown to boost recombinant protein expression approximately fourfold in maize ( Hood et al ., 2003 ). Direct transformation of these lines may not be a viable strategy, as maize transformation is very inefficient for these lines. Rather, a transgene introduced into a maize line receptive to transformation can then be introgressed into the desired germplasm. Moreover, the introgression of a transgene into a high-yielding agronomic line can significantly boost foreign protein expression ( Hood et al ., 2003 ), in addition to the anticipated increase in yield on a per acre basis. 
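The two genetic effects described here reduce to simple arithmetic, sketched below under stated assumptions. The first function enumerates the progeny of a selfed plant carrying a single hemizygous transgene insertion, assuming standard Mendelian segregation at one locus. The second shows why a larger embryo raises embryo-expressed protein on a seed weight basis. The function names and all numerical inputs are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from itertools import product


def selfing_copy_number_distribution():
    """Expected transgene copy number among progeny of a selfed hemizygous plant.

    Each gamete carries either one copy (1) or no copy (0) of the transgene;
    all four gamete pairings are taken to be equally likely.
    """
    gametes = [1, 0]
    counts = Counter(a + b for a, b in product(gametes, repeat=2))
    total = sum(counts.values())
    return {copies: n / total for copies, n in counts.items()}


def recombinant_mg_per_kg_seed(embryo_fraction, mg_per_kg_embryo):
    """Recombinant protein per kg of whole seed when expression is embryo-specific.

    embryo_fraction  -- embryo mass as a fraction of whole-seed mass (assumed value)
    mg_per_kg_embryo -- recombinant protein concentration in embryo tissue (assumed value)
    """
    if not 0 <= embryo_fraction <= 1:
        raise ValueError("embryo_fraction must be between 0 and 1")
    return embryo_fraction * mg_per_kg_embryo


if __name__ == "__main__":
    print(selfing_copy_number_distribution())
    # Hypothetical comparison: doubling the embryo fraction at a fixed embryo concentration.
    print(recombinant_mg_per_kg_seed(0.10, 100.0))
    print(recombinant_mg_per_kg_seed(0.20, 100.0))
```

Under these assumptions, one quarter of selfed progeny are homozygous and carry two copies, and protein per kilogram of seed scales linearly with embryo fraction when the concentration in the embryo is unchanged.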
Certain mutants and engineered transgenic lines may also be well suited to recombinant protein production. For example, a transgenic maize line expressing a cytokinin-synthesizing enzyme under the control of a senescence-inducible promoter develops two normal-sized embryos, one at the expense of endosperm tissue ( Young et al ., 2004 ). Thus, as with high-oil lines, this transgenic line has the potential for increased production of embryo-expressed recombinant proteins. <h1>Protein accumulation and stability</h1> Several approaches have been followed that focus on the stability of foreign proteins expressed in plants. These include avoiding amino acid sequences that target proteins for degradation, targeting proteins to an organellar environment suitable for accumulation, directing expression in a tissue type or at a time suited to protein accumulation, and making fusions to improve stability. <h2>Avoidance of sequences directing protein degradation</h2> Specific amino acid sequences identifying proteins for rapid turnover should be avoided when pursuing high-level expression. For example, a version of the N-end rule elucidated in microbial and mammalian systems, by which cytoplasmic or nuclear proteins with certain amino acids at their N-terminus are rapidly degraded, is also relevant to plants ( Vierstra, 1996 ). Amino acids that target degradation should be avoided at the N-terminus, or recombinant proteins should be targeted to alternative organelles. <h2>Subcellular targeting of foreign proteins</h2> Recombinant proteins encoded by transgenes integrated into the nuclear genome can be targeted to specific subcellular sites by including signal sequences on the expression constructs. In the absence of a signal sequence, the recombinant protein will accumulate in the cytoplasm, and this has been the most common means of expression pursued. 
However, signal sequences targeting recombinant protein to the apoplast, the plastid, the mitochondrion, vacuolar compartments, and the nucleus have also been used ( Hood et al ., 1997 ; Logan and Leaver, 2000 ; Streatfield et al ., 2003 ). Combining an apoplast targeting sequence with an endoplasmic reticulum retention sequence results in the recombinant protein being sequestered in the endoplasmic reticulum ( Streatfield et al ., 2003 ). For some organelles, including plastids and mitochondria, alternative signal sequences offer suborganellar options for targeting recombinant protein. In cases in which recombinant proteins have been targeted to alternative subcellular sites, considerable variation in protein accumulation has been observed. For example, the receptor binding subunit of the heat-labile toxin of Escherichia coli has been expressed in six alternative subcellular locations in maize seed, with expression varying across four orders of magnitude between organelles ( Streatfield et al ., 2003 ). Factors that may affect the degree of accumulation in different subcellular locations include the biochemical environment of the compartment, for example pH, and the available space in the compartment for protein storage. Space may be a constraint for a highly expressed recombinant protein. For example, the expression of hepatitis B surface antigen in soybean cell culture, with the recombinant protein localized to the endoplasmic reticulum, resulted in dilation of this membrane network, possibly indicating an accumulation limit being reached ( Smith et al ., 2002 ). Signal sequences specifying subcellular targets are not always removed from recombinant proteins. This may be a consequence of vector design, as with the C-terminal endoplasmic retention signal, which is not cleaved from native proteins that carry it. 
However, even if a signal sequence is removed from its native protein, it may not be cleaved when fused to a recombinant protein with a different amino acid sequence immediately adjoining the fusion site. For example, adding the signal sequence of soybean vegetative storage protein to the N-terminus of hepatitis B surface antigen for expression in tobacco cell culture resulted in the accumulation of the antigen with attached signal peptide in the endoplasmic reticulum ( Sojikul et al ., 2003 ). Non-removal or imprecise cleavage of signal sequences may affect recombinant protein activity and may raise safety concerns for products that must pass through a regulatory path for licensing. Moreover, the use of native signal sequences of recombinant proteins in plant expression systems can result in unexpected targeting, as seen with the receptor binding subunit of the heat-labile toxin of E. coli expressed from its native signal sequence in maize endosperm tissue ( Chikwamba et al ., 2003 ). Apoplast targeting was anticipated, but protein accumulated in starch granules, suggestive of amyloplast targeting. The selection of organelle targets can also affect the post-translational processing of recombinant proteins. For example, proteins targeted through the endoplasmic reticulum can be glycosylated. If a recombinant protein is most strongly expressed in an organelle in which it will be glycosylated, but glycosylation would be detrimental to activity, one option is to mutate the normally glycosylated sites to alternative amino acids. Differences between animal and plant glycosylation patterns are being addressed in some recombinant plant systems. Anti-sense potato and tobacco lines have been generated that do not add a plant-specific moiety ( Wenderoth and von Schaewen, 2000 ), and transgenic tobacco lines have been developed with the capacity to add mammalian moieties normally absent from plant proteins ( Bakker et al ., 2001 ). 
For recombinant proteins synthesized in the plastid from sequences integrated into the plastid genome, suborganellar targeting to particular locations within this organelle can be considered. Given their endosymbiotic origin, plastids may be particularly well suited to the production of bacterial proteins. Recombinant proteins are not glycosylated, but correct disulphide bonding can be accomplished ( Staub et al ., 2000 ). In addition, other post-translational modifications, such as lipidation, can be achieved, as for the outer surface lipoprotein A of Borrelia burgdorferi expressed from the tobacco plastid genome ( Glenz et al ., 2006 ). With regard to protein processing, plant viral expression systems are more akin to nuclear than plastid systems. They allow for targeting and for glycosylation of recombinant proteins ( Kumagai et al ., 2000 ; Dirnberger et al ., 2001 ). <h2>Tissue-specific, temporal and inducible expression</h2> The choice of tissue for expression can greatly influence the accumulation and stability of recombinant proteins. Factors such as protein content, water content and presence of proteases and protease inhibitors affect foreign protein accumulation. The choice of tissue may also be influenced by the anticipated product application. For example, an edible tissue is suitable for an oral vaccine. As discussed above, tissue-specific promoters allow for the expression of recombinant proteins only in preferred target tissues, such as tobacco leaves ( Dai et al ., 2000 ) and cereal seeds ( Hood et al ., 2003 ). The timing of expression can also affect protein accumulation and stability. For example, during the development of cereal seeds, most of the storage tissues are laid down just prior to desiccation. Expression from seed promoters active at this time will favour a similar accumulation of foreign proteins prior to desiccation, and this should favour protein stability. 
The expression of recombinant proteins at this late stage is also less likely to negatively affect plant health or embryo development. An alternative strategy, applicable to fresh tissues such as developing seedlings and leaves, is to harvest tissue expressing recombinant protein when accumulation of the protein is optimal ( Rodriguez, 1999 ; Daniell et al ., 2001a ). The use of inducible promoters allows for expression in a target tissue only following a specific treatment, such as chemical spraying ( Aoyama and Chua, 1997 ; Zuo and Chua, 2000 ; Padidam, 2003 ). In this case, protein accumulation can be restricted to a fine timeframe to limit detrimental effects on plant health. In addition, some regulatory sequences boost activity under certain growth conditions, such that the bulk of foreign protein accumulates in a brief period prior to harvest. For example, the plastid untranslated leader sequence of the 32-kDa D1 polypeptide of photosystem II enhances translational activity on switching growth conditions from a light/dark cycle to continuous light, and this has been utilized to boost yields from plants grown under controlled light conditions ( Fernandez-San Millan et al ., 2003 ; Koya et al ., 2005 ). <h2>Protein fusions</h2> Protein fusions can address issues of subcellular targeting, stability, purification and activity of product. Expression of a foreign protein in plants as a fusion to a plant protein present at high levels in a subcellular storage compartment within seed tissues allows for the foreign protein to be targeted to the same location as the native protein, where it can stably accumulate. For example, foreign proteins have been fused to oleosins for expression in Brassica napus seeds, where they accumulate in oil bodies ( van Rooijen and Moloney, 1995 ). Fusion of a foreign protein to the C-terminus of a single ubiquitin coding unit has been used to stabilize expressed recombinant proteins in several recombinant systems. 
In plants, this has increased recombinant protein levels 10-fold ( Garbarino et al ., 1995 ). The ubiquitin unit is subsequently removed in vivo . As a further alternative, polyproteins have been co-expressed in plants with a plant viral protease. Cleavage by the protease at a recognition site engineered between sequences encoding polypeptide chains yields distinct recombinant proteins ( Dasgupta et al ., 1998 ). These proteins can be targeted to different organelles. Fusion of a foreign protein or peptide to a second recombinant protein that has been shown to be stably expressed in plants can act to stabilize the target protein or peptide. The fusion of a tuberculosis antigen to the receptor binding subunit of the heat-labile toxin of E. coli follows this strategy ( Rigano et al ., 2004 ). An added advantage in this case is that the heat-labile toxin subunit can direct the fused antigen to ganglioside receptors on the surface of the gut, so facilitating delivery of the product to the target tissue. In addition, fusion of a foreign protein to a recombinant protein with defined binding characteristics or to an affinity tag can simplify purification. This has been applied in insect cells and other recombinant systems with fusions to molecules such as avidin ( Airenne et al ., 1999 ). However, later cleavage from the protein carrier or peptide tag may be necessary to release the desired product, so adding to the downstream processing costs. With plant viral expression systems, a peptide or protein is often expressed as a fusion to a viral coat protein ( Canizares et al ., 2005 ). Such fusions serve to stabilize the expressed protein or peptide, and also simplify purification, as plant viral particles can easily be separated from plant tissues. With plant-based vaccines the plant viral particles may also enhance the immunogenicity of the incorporated antigens. 
However, there is generally a strict size limit on the length of foreign sequence that can be fused without disrupting the assembly of the viral protein coat. Typically, only up to a few tens of amino acids can be fused to the coat protein ( Porta et al ., 2003 ). The isoelectric point of the sequence fused to the viral coat protein also affects the ability of the fusion to systemically infect plant tissues. With cowpea mosaic virus, the addition of a peptide with a high isoelectric point has a negative impact on the yield ( Porta et al ., 2003 ). A strategy to overcome the size limit on viral coat fusions, but still to obtain particle assembly, is to design a fusion with the 16-amino-acid 2A peptide of foot-and-mouth disease virus positioned between the foreign protein and the coat protein. A proportion of the fusion is co-translationally cleaved to yield coat protein, allowing for the assembly of particles containing both coat protein and uncleaved fusion molecules ( Cruz et al ., 1996 ; Smolenska et al ., 1998 ). Use of the 2A peptide has also been extended to stable transgenic plants to yield multiple proteins from a single transgene that can be independently targeted to different organelles ( El Amrani et al ., 2004 ). <h2>Expression of zymogens</h2> The expression of proteases in plants offers a particular challenge, as successful expression is likely to result in the degradation of plant proteins and poor plant health. Organellar targeting and tissue-specific expression may serve to limit negative effects, but an alternative strategy was followed to express commercial levels of bovine trypsin in maize ( Woodard et al ., 2003 ). In this case, expression was achieved without excessive proteolysis of plant proteins and poor plant health by producing the zymogen form of the protease. 
This strategy potentially calls for an additional proteolytic cleavage step during product purification, but, in the case of trypsinogen, the zymogen form of the enzyme was converted to active trypsin either in seed tissues or during protein extraction from seed ( Woodard et al ., 2003 ). <h1>Future prospects</h1> The expression of foreign proteins in plants with a view to commercial production has attracted considerable attention over the past decade. However, only a few small-scale products have so far reached the market. These are pure protein products that require cost-effective processing and purification schemes for commercial viability, and currently they are not produced economically on a large scale ( Hood et al ., 1997 ; Woodard et al ., 2003 ). Advances are required to significantly boost expression further, but, to date, many of the available tools discussed above have not been put together to optimize the expression of a single protein product. This stacking of approaches will probably be required to produce commercial products, although this can be limited by access to the necessary intellectual property. Moreover, when implementing new strategies and combining currently available approaches, care must be taken to minimize potential negative influences, in particular gene silencing ( De Wilde et al ., 2000 ). In the near term, enzymes for large-scale industrial processes and antigens for oral animal vaccines are the most likely plant-expressed products to be commercially viable. Purification is not necessary for these products, and regulatory paths are less stringent than for pharmaceuticals ( Streatfield, 2005 ). With edible vaccines, expression levels of some antigens are already sufficient for economic products, provided that promising early-stage efficacy data are confirmed in later trials ( Lamphear et al ., 2004 ). With industrial enzymes, further increases in expression will be necessary. 
Much more attention will also need to be paid to technical hurdles to ensure correct post-translational processing and protein stability in plant tissues. Post-translational processing is likely to be particularly important for pharmaceuticals, for which safety and efficacy concerns over issues such as plant-specific glycosylation may arise. Moreover, recombinant protein stability will be affected by host proteases. For recombinant protein production in microbes, host lines have been developed lacking proteases ( Meerman and Georgiou, 1994 ). A similar approach with plants, achieved through gene knock-out or silencing, should serve to increase recombinant protein stability. Foreign protein expression in plants has largely focused on the expression of one or, in a few cases such as antibodies, a few molecules at one time ( Hiatt et al ., 1989 ; During et al ., 1990 ). However, in some cases it may be necessary to express several proteins in the same plant material, for example for an edible vaccine comprising several antigens along with a protein adjuvant. As discussed above, strategies are available to express multiple proteins from single vectors or to combine expressing lines. However, the development of plant artificial chromosomes should allow for many more proteins to be co-expressed ( Brown et al ., 2000 ).

Plant Biotechnology Journal, Volume 5 (1) – Jan 1, 2007


Publisher: Wiley
Copyright: © 2006 Blackwell Publishing Ltd
ISSN: 1467-7644
eISSN: 1467-7652
DOI: 10.1111/j.1467-7652.2006.00216.x
PMID: 17207252

Abstract

<h1>Introduction</h1> Genetically engineered microorganisms are a very important source of industrial and medicinal proteins. Over the last 30 years, the economic and safety advantages of recombinant microbial systems have allowed them to replace many native sources of proteins ( Cronin, 1997 ; Frank, 1998 ). Recombinant microbes have also enabled the production of alternative engineered protein forms with modified activities and properties ( Crabb and Bolin, 1999 ). However, microbes have certain limitations in the classes of proteins that can be economically produced, and in the post-translational processing that can be achieved. Thus, insect and mammalian cell cultures have also been utilized for eukaryotic protein production ( Lubiniecki and Lupker, 1994 ; Kost and Condreay, 1999 ), and transgenic animals ( Houdebine, 2002 ) are also being evaluated. However, with these systems, in particular cell cultures, production costs are prohibitively high for many proteins. Over the past decade plants have been developed as recombinant protein production systems ( Howard and Hood, 2005 ). Potentially, they have great advantages over microorganisms, and especially animal cell systems, for the costs of production ( Daniell et al ., 2001a ). Moreover, unlike bacteria, they are capable of eukaryotic post-translational modifications, most importantly glycosylation ( Hood et al ., 1997 ; Karnoup et al ., 2005 ). This can be achieved without the hyperglycosylation that is often observed with certain yeast systems ( Nakamura et al ., 1993 ; Montesino et al ., 1998 ). Plants are particularly well suited to the production of edible oral vaccines ( Streatfield and Howard, 2003 ) and industrial enzymes for biomass treatment ( Xue et al ., 2003 ; Teymouri et al ., 2004 ). Protein purification is not required and, in the case of industrial enzymes, volumes are anticipated to be very large. Several plant systems have been evaluated for the economic production of recombinant proteins. 
These can be categorized by the approach used to generate the production vehicle and by the type of tissue in which recombinant protein is produced ( Table 1 ). Three broad approaches have been followed to make heterologous proteins in plants. The most common approach, and one that is applicable to a wide range of species, has been to use biolistics or agrobacteria to generate stably transformed plants ( Davey et al ., 1989 ; Gelvin, 2003 ). Recombinant gene sequences are integrated into the host plant's nuclear genome, and the approach can be routinely applied, even to multisubunit proteins, such as antibodies ( Hiatt et al ., 1989 ). Where possible, agrobacterial infection is generally preferred, as the resulting transgenic plants are less subject to silencing ( De Wilde et al ., 2000 ). A second approach has been to use biolistics to generate transplastomic plants in which the plastid genome is modified to incorporate recombinant sequences ( Svab and Maliga, 1993 ). Homologous recombination targets the transgene to a specific locus, and transplastomic plants are not subject to silencing. Until recently, very few tissue types and species were amenable to this technology, but it has now been extended to more important crops ( Daniell et al ., 2005 ). Mitochondrial transformation is even more limited, with no reliable methodology yet developed, and so the production of recombinant proteins in plant mitochondria has not been pursued. As a third approach to produce foreign proteins in plants, some species have been inoculated with recombinant plant viral genomes. The engineered viral sequences are transiently expressed in infected plant tissues, usually leaves ( Porta and Lomonossoff, 2002 ; Canizares et al ., 2005 ). Transient recombinant protein production can also be accomplished using biolistics ( Godon et al ., 1993 ) or agrobacteria ( Hellens et al ., 2005 ), but, in these cases, expression tends to be impractically low. 
Depending on which of these approaches is followed to introduce recombinant sequences into plants, and depending on the species of plant utilized, foreign proteins can be produced in various plant tissue types and can even be secreted into surrounding media, greatly simplifying purification ( Borisjuk et al ., 1999 ; Fischer et al ., 1999 ). Thus, the tissue type for expression can greatly influence the costs of protein production ( Kusnadi et al ., 1997 ; Horn et al ., 2004 ). Other factors, such as the options for post-translational processing, potential for contamination with harmful secondary metabolites and the confidence for biological containment, differ between alternative plant expression systems and target tissues. However, production costs are the key issue for recombinant protein production, and several factors affect these costs. These include the upfront costs of generating recombinant plant material, the costs of growing and harvesting plant material, and the downstream processing and purification costs. The latter tend to increase with tissue complexity. High levels of recombinant gene expression can greatly reduce the amounts of plant biomass required and can simplify processing and purification of the product. Thus, an overriding concern for all plant systems is the achievement of high expression ( Nandi et al ., 2005 ). Some protein products, such as industrial enzymes for applications such as bleaching and edible vaccines for delivery to animals, may require only limited processing of plant material ( Hood et al ., 2003 ; Lamphear et al ., 2004 ). However, these products are relatively inexpensive, and so must be produced at high levels in plants to ensure economic viability. 
At the other extreme, more expensive products, such as pharmaceuticals for administration to humans and enzymes for pharmaceutical processing, require protein extraction and purification, greatly adding to the production costs ( Samyn-Petit et al ., 2001 ; Woodard et al ., 2003 ). These costs can be minimized through increasing expression. Thus, high-level expression is essential for economic recombinant protein production. In addition, for protein products that are not extracted and purified prior to use, expression must be sufficiently high to ensure efficacy. For example, for an edible vaccine, a sufficient dose of antigen to confer protection must be delivered in a quantity of plant tissue that can be practically ingested at a single sitting ( Lamphear et al ., 2004 ; Tacket et al ., 2004 ). Strategies have been developed to increase the levels of recombinant proteins for plant production systems. These focus on the synthesis and stability of recombinant nucleic acids and encoded proteins. The type of system used for protein production specifies the range of approaches that can be applied. With stable nuclear transgenics, fairly high levels of expression have been achieved for some recombinant proteins ( Hood et al ., 1997 ; Streatfield et al ., 2002 ), and there are many approaches available to boost expression. With transplastomic expression, very high levels of a few recombinant proteins have been achieved ( Daniell et al ., 2001b ; Kumar and Daniell, 2004 ), presumably due in part to the presence of up to 100 genomes per plastid and up to 100 plastids per cell. However, options to further boost expression are currently more limited than with nuclear transgenics. With transient viral systems, fairly high levels of expression have been achieved ( Kumagai et al ., 1993 ), and tools are available to further boost expression. In this article, the various approaches used to boost recombinant protein production levels in plants are reviewed ( Table 2 ). 
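Two of the quantitative points made above can be sketched as simple arithmetic: the upper bound on plastid genome copies per cell, and the mass of plant tissue needed to deliver a given antigen dose at a given expression level. The function names and the dose, protein content and expression values in the example are hypothetical and chosen only to show the scaling; they are not taken from the cited studies.

```python
def plastid_genomes_per_cell(genomes_per_plastid: int, plastids_per_cell: int) -> int:
    """Upper bound on transgene-bearing plastid genome copies in one cell."""
    return genomes_per_plastid * plastids_per_cell


def tissue_needed_g(dose_mg: float, tsp_mg_per_g: float, fraction_of_tsp: float) -> float:
    """Grams of tissue required to supply `dose_mg` of recombinant protein.

    tsp_mg_per_g    -- total soluble protein per gram of tissue (assumed value)
    fraction_of_tsp -- recombinant protein as a fraction of total soluble protein
    """
    antigen_mg_per_g = tsp_mg_per_g * fraction_of_tsp
    return dose_mg / antigen_mg_per_g


if __name__ == "__main__":
    # Figures quoted in the text: up to 100 genomes per plastid, up to 100 plastids per cell.
    print(plastid_genomes_per_cell(100, 100))

    # Hypothetical oral dose of 1 mg antigen from tissue with 10 mg soluble protein per gram.
    for fraction in (0.001, 0.01, 0.1):
        grams = tissue_needed_g(1.0, 10.0, fraction)
        print(f"{fraction:.1%} of TSP -> about {grams:.0f} g of tissue")
```

Under these assumed values, each 10-fold increase in expression cuts the required tissue mass 10-fold, which is why expression level determines whether a dose can be ingested at a single sitting.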
<h1>Molecular approaches to boost expression</h1> Several molecular approaches have been applied to boost heterologous gene expression. These target replication, transcription, transcript stability and translation. <h2>Boosting the replication of transiently transformed sequences</h2> With transiently expressing plant viral systems, recombinant proteins are encoded on an engineered viral vector. Most attention has focused on the use of positive-sense RNA viruses, in which, in addition to replication, the inoculating genome can be directly translated. In vitro -synthesized transcripts derived from cDNA clones of viral vectors are inoculated on to susceptible plant tissues. Foreign genes are generally inserted as additional reading frames into viral genomes rather than replacing native genes ( Canizares et al ., 2005 ). This overcomes the problems observed in maintaining and expressing viral sequences when native genes are deleted. However, with many viral systems, there appear to be size constraints on the foreign protein expressed. Cis -acting elements can have a major effect on expression. For example, viral heterologous sequences can influence the expression of reporter genes ( Shivprasad et al ., 1999 ). Sequences encoding proteins required for viral replication are generally present on the inoculating transcripts. However, these replication functions can alternatively be supplied in trans from sequences stably integrated into the nuclear genome of the host plant ( Taschner et al ., 1991 ; Sanchez-Navarro et al ., 2001 ; Yusibov et al ., 2002 ). This approach allows plants to be developed that constitutively express sufficiently high levels of the viral proteins required for replication, and has the added benefit of conferring a degree of biological containment on the inoculating viral sequences. Post-transcriptional gene silencing can be avoided by expressing replicase functions from inducible promoters ( Mori et al ., 2001 ). 
As an extension of this approach to incorporate viral sequences into the host genome, efforts are underway to use viral replication machinery to boost expression from viral cassettes integrated into the genomes of transgenic plants. The aim is for sequences encoding viral replicases and sequences encoding viral vector-expressed foreign proteins to be integrated into the host plant genome, allowing RNA transcribed from the integrated viral vector to be replicated to high levels. Similarly, a geminivirus expression system has been engineered that avoids repeated infection cycles each time recombinant protein is to be produced. In this case, viral sequences encoding foreign proteins have been co-transformed with a vector directing the constitutive expression of viral replicases by biolistic bombardment of a tobacco cell line ( Hefferon and Fan, 2004 ). High-level replication can be transiently achieved and foreign protein expression enhanced. Ongoing work with this system is focused on the development of stably transformed lines with the replicase functions under inducible control and encoded by the plant genome, and the recombinant viral sequences for foreign protein expression also inserted into the genome but subject to excision and replication. As an alternative to the engineering of the genome of the host plant to provide viral replication functions, viral replicons can be delivered through infection with Agrobacterium , a process termed ‘magnifection’ ( Gleba et al ., 2005 ). Using this system, foreign protein can be transiently expressed at up to 80% of total soluble protein. <h2>Boosting transcription</h2> Increasing the levels of transcription of stably transformed sequences has probably received the most attention of the various approaches to boost expression. Transcription is an early step in the process of generating recombinant protein, and thus improvements here are viewed as having great potential to increase yield. 
Several promoters have been tested, of both plant pathogen and plant origin. In addition, some synthetic promoters have been developed. As tobacco and Arabidopsis are generally relatively easily transformed, and leaf tissues provide most of the biomass with these species, promoters that drive relatively high levels of expression in dicot leaves were the tools of choice for much early phase research. The promoters tested included those of plant viruses and highly expressed dicot leaf genes. The cauliflower mosaic virus 35S promoter ( Odell et al ., 1985 ) was the mainstay of early phase work focused on expressing recombinant proteins in dicot leaves. Using this promoter, fairly high levels of accumulation of recombinant proteins have been achieved ( Fiedler et al ., 1997 ; Gutierrez-Ortega et al ., 2005 ). Other plant viral promoters have also been tested, including a cassava vein mosaic virus promoter ( Verdaguer et al ., 1996 ), the C1 promoter of cotton leaf curl Multan virus ( Xie et al ., 2003 ) and the promoter of component 8 of Milk vetch dwarf virus ( Shirasawa-Seo et al ., 2005 ). Most of the viral promoters tested also show activity in monocots ( Shirasawa-Seo et al ., 2005 ). Some plant viral promoters show tissue specificity ( Mazithulela et al ., 2000 ). However, the strongest expressing plant viral promoters are generally not tissue specific ( Odell et al ., 1985 ), and the expression of recombinant proteins in tissues other than the target tissue can be problematical for plant health and can make biological containment difficult. Agrobacterium tumefaciens promoters have also been used to express recombinant proteins in plants. These include the nopaline synthase ( Shaw et al ., 1984 ) and mannopine synthetase promoters. 
The nopaline synthase promoter is less active than the cauliflower mosaic virus 35S promoter ( Sanders et al ., 1987 ; Harpster et al ., 1988 ), but the mannopine synthetase promoter shows comparable or increased activity ( Stefanov et al ., 1991 ). Leaf-specific expression has been achieved using plant leaf promoters, such as that of the small subunit of ribulose-bisphosphate carboxylase, giving expression levels in excess of 1% of total soluble protein ( Dai et al ., 2000 ). Dicot seed promoters have also been assessed, and very high levels of foreign gene expression have been achieved using the arcelin 5-I promoter of common bean ( De Jaeger et al ., 2002 ). Using this system, an antibody single-chain variable fragment accumulated to approximately 36% of total soluble protein. As a result of the relative difficulty of transformation, less work has focused on monocot systems, even though the seed tissues of cereals are very well suited to protein accumulation and storage. However, several groups have pursued monocot systems, often using high-expressing plant promoters, such as the maize polyubiquitin-1 promoter ( Christensen et al ., 1992 ). Engineered versions of this promoter ( Streatfield et al ., 2004 ) have achieved fairly high levels of expression in seeds. However, polyubiquitin-1 is a constitutive promoter, and therefore several other seed-specific monocot promoters have also been assessed, including the embryo-specific maize globulin-1 promoter ( Belanger and Kriz, 1991 ) and the endosperm-specific maize 27-kDa zein and barley D hordein promoters ( Russell and Fromm, 1997 ; Horvath et al ., 2000 ). The expression levels achieved in the seed tissues of monocots using seed-specific promoters have exceeded those using constitutive promoters ( Hood et al ., 2003 ). Transplastomic expression requires the use of promoters that are active in the plastid, which is somewhat akin to a prokaryotic expression system. 
The promoters tested here are of highly expressed plastid genome sequences, including the constitutive promoter of the rRNA operon. Very high levels of expression have been achieved using this promoter ( Daniell et al ., 2001b ; Tregoning et al ., 2003 ). However, the high expression level here is primarily considered to be a consequence of having up to 10 000 plastid genomes per leaf cell. Foreign genes inserted into plant viral vectors for transient expression are under the transcriptional control of native plant viral promoters. As an alternative, foreign genes may be placed under the control of a viral subgenomic promoter. A subgenomic promoter can be duplicated to increase expression ( Chapman et al ., 1992 ). In addition, separate subgenomic promoters can be used for the foreign gene and the viral coat protein. In this way, deletion of sequences as a result of recombination between identical regulatory elements can be avoided ( Donson et al ., 1991 ). Several strategies have been followed to further boost transcription over that achieved with plant or plant viral promoters. Multiple copies of enhancer sequences from a highly active promoter, such as cauliflower mosaic virus 35S, can be stacked to boost expression ( Guerineau et al ., 1992 ). Further synthetic promoters have been developed that combine the most active sequences of multiple well-characterized natural promoters. For example, elements of the cauliflower mosaic virus 35S promoter and the Agrobacterium Ti plasmid mannopine synthetase promoter have been combined. In this case, the synthetic promoter is several-fold more active than either component ( Comai et al ., 1990 ). Similar results have been achieved with chimeric plant viral promoters ( Rance et al ., 2002 ). Another approach to boost transcript levels has been to stack transcription units. 
These units can be driven by repeats of the same promoter and terminator or, preferably, by different controlling elements to reduce the potential for recombination or silencing. The stacking of transcription units typically increases expression over the use of a single unit, and allows for multiple proteins to be expressed ( During et al ., 1990 ). However, transcriptional interference usually results in promoter activities being less than fully additive ( Thompson and Myatt, 1997 ). The orientation of the transcription units relative to one another affects the degree of interference observed, and interference can be eliminated using transcriptional blocker sequences between the units ( Padidam and Cao, 2001 ). An alternative way to express multiple copies of the same gene is to use synthetic bidirectional promoters with enhancer elements flanked on either side by core promoter elements ( Xie et al ., 2001 ; Li et al ., 2004 ). A further way to increase transcriptional activity is to include global regulatory sequences next to promoters on expression cassettes. These scaffold attachment or matrix attachment regions have been characterized as interacting with plant nuclear scaffolds in vitro , and are considered to place surrounding loci in locations suitable for the recruitment of transcription factors to promoters ( Spiker and Thompson, 1996 ). These global sequences can boost the expression of plant genes by an order of magnitude ( Allen et al ., 1993 ; Li et al ., 2002 ). In addition to approaches to boost transcription that focus on cis -acting elements, other strategies have been followed that utilize trans -acting factors. Transcription factors capable of activating expression from chosen cis -acting promoter sequences are co-expressed in transgenic plants ( Yang et al ., 2001 ). The transcription factors may bind directly to the promoters of the recombinant genes, or may interact with other factors binding these promoters. 
However, for coexpressed factors to have a positive effect on transcription, native levels of these factors must be limiting in the host plant. The expression unit encoding a transcription factor can be co-transformed together with the transcription unit for recombinant protein expression, an approach that has boosted expression by two- to fourfold ( Yang et al ., 2001 ). Alternatively, a transgenic line can be developed expressing elevated levels of a transcription factor, and this line can be transformed with a recombinant protein expression unit. As a further alternative, separate lines expressing a transcription factor and a recombinant expression unit can be crossed. A transcription factor can also be provided transiently by the inoculation of a transgenic line with plant viral sequences encoding the factor ( Hull et al ., 2005 ). Moreover, by expressing a transcription factor from a promoter that the factor can itself bind to, very high levels of the transacting factor can potentially be achieved through positive feedback ( Schwechheimer et al ., 2000 ). In addition to stable nuclear transgenic systems, the co-expression of trans -acting factors has been applied to transplastomic plants. For example, T7 RNA polymerase has been expressed from the nuclear genome and then targeted to the plastid, where it can transcribe sequences integrated into the plastid genome ( McBride et al ., 1994 ). <h2>Stabilizing the message and ensuring correct message processing</h2> As a recombinant message is transcribed, the stability of that message is important to ensure high-level expression. The non-translated region located downstream of the translation stop codon is critical for processing and should include signals targeting the message for polyadenylation. 
Message destabilizing sequences in the downstream region can greatly affect stability ( Green, 1993 ; Newman et al ., 1993 ), and these sequences must be avoided when preparing gene constructs to express high levels of recombinant proteins. Several plant viral and plant 3′ untranslated regions have been utilized to process messages for recombinant proteins without destabilizing these messages. They include the cauliflower mosaic virus 35S terminator and the potato proteinase inhibitor II terminator ( Hood et al ., 1997 ). For plant-based expression systems, eukaryotic genes are generally re-synthesized or engineered to remove introns. Sequences such as consensus intron splice sites, message destabilizing sequences and transcript termination sequences should be avoided in constructing these mini-genes. However, some plant intron sequences contribute positively to the expression level observed for their native genes, and these sequences can boost expression if inserted as synthetic introns into genes ( Fiume et al ., 2004 ). <h2>Boosting translation</h2> There are several approaches to increase translational activity for recombinant sequences. Several plant and plant viral 5′ non-translated regions have been used with the aim of increasing the rate of translation initiation. For example, the tobacco mosaic virus and potato virus X leader sequences boost recombinant protein levels ( Gallie et al ., 1987 ; Pooggin and Skryabin, 1992 ). In order to optimize translation, any sequences located immediately around the translation start site should be modified to fit the consensus initiation sequence, which varies between plant species ( Joshi et al ., 1997 ). Many of the genes expressed in recombinant plant systems, particularly those expressed by industrial groups, are synthesized de novo from oligonucleotides. This allows for fully codon-optimized genes to be designed to suit the expression host. 
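As a rough illustration of what designing a codon-optimized gene can involve, the Python sketch below back-translates a short peptide using a per-host ranking of preferred codons. The table `PREFERRED_CODONS`, the function name `back_translate` and the `max_run` parameter are hypothetical and chosen only for illustration; real preferences would come from codon usage data for the chosen host and genome compartment. As one simple design choice, the sketch falls back to the next-ranked synonymous codon when the same codon would otherwise repeat too many times in a row.

```python
# Hypothetical ranking of synonymous codons (most preferred first) for a
# small set of amino acids. These rankings are illustrative placeholders,
# not measured codon usage for any real plant host.
PREFERRED_CODONS = {
    "M": ["ATG"],
    "A": ["GCC", "GCT"],
    "K": ["AAG", "AAA"],
    "L": ["CTC", "CTG"],
    "S": ["TCC", "AGC"],
    "*": ["TGA"],
}


def back_translate(protein, table=PREFERRED_CODONS, max_run=2):
    """Back-translate a protein, capping consecutive repeats of one codon.

    max_run is assumed to be >= 1.
    """
    codons = []
    for aa in protein:
        choices = table[aa]
        pick = choices[0]
        # If the top choice would extend a run beyond max_run, use the
        # next-ranked synonymous codon when one is available.
        recent = codons[-max_run:]
        run_is_full = len(recent) == max_run and all(c == pick for c in recent)
        if run_is_full and len(choices) > 1:
            pick = choices[1]
        codons.append(pick)
    return "".join(codons)


if __name__ == "__main__":
    print(back_translate("MAAAAKLS*"))
```

A fuller design tool would also screen the resulting sequence for motifs to avoid, such as cryptic splice sites or message destabilizing sequences, which this sketch does not attempt.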
Experiments with tobacco-expressed green fluorescent protein have demonstrated the benefit of codon optimization in plants ( Rouwendal et al ., 1997 ). Preferred codon usage differs for monocots vs. dicots, and is greatly different for genes expressed from plastid genomes. Codon optimization within approximately 40 amino acids of the N-terminus is particularly important for increasing recombinant protein production ( Batard et al ., 2000 ). Engineering the sequence here allows for most of the benefits of codon optimization, whilst minimizing gene construction costs. As another alternative to synthesizing whole genes, specific rare codons can be removed by site-specific mutagenesis. However, when optimizing a gene, extensive runs or localized concentrations of a specific codon should be avoided, so that the corresponding tRNA does not become rate limiting. Extensive predicted mRNA secondary structures that might hinder translation should also be avoided, as should internal ribosome entry sites that might compete for translation of the full message. Polycistronic messages offer an alternative to multiple transcription units for the expression of more than one recombinant protein in transgenic plants. Co-expression of two reporter genes together with the cauliflower mosaic virus translational activator allowed for comparable levels of each foreign protein to accumulate ( Futterer and Hohn, 1991 ). The expression of multiple foreign genes as polycistronic messages has also been achieved in plastids ( Staub and Maliga, 1995 ; Quesada-Vargas et al ., 2005 ). <h1>Genetic approaches to boost expression</h1> Several genetic approaches have been applied to boost expression, including increasing the transgene copy number through selfing or crossing, and introducing foreign genes into germplasm suited to their over-expression. 
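For the selfing and crossing routes just mentioned, a small Python sketch can make the expected copy-number outcomes concrete. It assumes simple Mendelian inheritance of unlinked, single-copy insertions with no silencing or selection, an assumption that real transgenic lines may not satisfy; the function name `copy_number_distribution` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def copy_number_distribution(parent_a, parent_b):
    """Expected offspring transgene copy numbers from a cross.

    Each parent is a list giving, per unlinked locus, how many of its two
    alleles carry the transgene (0, 1 or 2). Assumes Mendelian segregation.
    """
    def gametes(parent):
        # Per locus: probability that a gamete carries 0 or 1 transgene copy.
        per_locus = [{0: Fraction(2 - n, 2), 1: Fraction(n, 2)} for n in parent]
        dist = {0: Fraction(1)}
        for locus in per_locus:
            new = {}
            for (total, p), (k, q) in product(dist.items(), locus.items()):
                if q:
                    new[total + k] = new.get(total + k, Fraction(0)) + p * q
            dist = new
        return dist

    offspring = {}
    pairs = product(gametes(parent_a).items(), gametes(parent_b).items())
    for (a, p), (b, q) in pairs:
        offspring[a + b] = offspring.get(a + b, Fraction(0)) + p * q
    return dict(sorted(offspring.items()))


if __name__ == "__main__":
    # Selfing a line hemizygous for one insertion.
    print(copy_number_distribution([1], [1]))
    # Crossing two lines homozygous for insertions at different loci.
    print(copy_number_distribution([2, 0], [0, 2]))
```

Under these assumptions, selfing a hemizygous single-insertion line yields one quarter homozygous progeny, whereas crossing two lines homozygous at independent loci yields progeny that all carry two copies, one at each locus.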
<h2>Increasing transgene copy number</h2> With stable transgenics, taking a single transgene to homozygosity through selfing doubles transgene representation and typically boosts expression ( Zhong et al ., 1999 ). Moreover, crossing high-expressing transgenic lines arising from independent transformation events can boost transgene copy number and expression levels. This approach to achieve high gene copy numbers appears to have a reduced risk of gene silencing in comparison with making multiple gene insertions during transformation. <h2>Selection of germplasm</h2> Certain germplasm is well suited to the expression of proteins at high levels. For example, maize lines have been developed over many generations with elevated protein or oil levels in seeds ( Bhattramakki and Kriz, 1996 ; Ting et al ., 1996 ). High-protein lines have the potential for the production of more recombinant protein in the seed on a weight basis. In high-oil lines, the increased proportion of oil is primarily a result of increased embryo size. Therefore, the level of recombinant proteins expressed in the embryo is also increased on a seed weight basis. As maize embryos have a much higher proportion of soluble to insoluble protein compared with endosperm, the use of high-oil lines can also increase recombinant protein expression on a percentage of total soluble protein basis. The use of high-oil lines has been shown to boost recombinant protein expression approximately fourfold in maize ( Hood et al ., 2003 ). Direct transformation of these lines may not be a viable strategy, as maize transformation is very inefficient for these lines. Rather, a transgene introduced into a maize line receptive to transformation can then be introgressed into the desired germplasm. Moreover, the introgression of a transgene into a high-yielding agronomic line can significantly boost foreign protein expression ( Hood et al ., 2003 ), in addition to the anticipated increase in yield on a per acre basis. 
Certain mutants and engineered transgenic lines may also be well suited to recombinant protein production. For example, a transgenic maize line expressing a cytokinin-synthesizing enzyme under the control of a senescence-inducible promoter develops two normal-sized embryos, one at the expense of endosperm tissue ( Young et al ., 2004 ). Thus, as with high-oil lines, this transgenic line has the potential for increased production of embryo-expressed recombinant proteins. <h1>Protein accumulation and stability</h1> Several approaches have been followed that focus on the stability of foreign proteins expressed in plants. These include avoiding amino acid sequences that target proteins for degradation, targeting proteins to an organellar environment suitable for accumulation, directing expression in a tissue type or at a time suited to protein accumulation, and making fusions to improve stability. <h2>Avoidance of sequences directing protein degradation</h2> Specific amino acid sequences identifying proteins for rapid turnover should be avoided when pursuing high-level expression. For example, a version of the N-end rule elucidated in microbial and mammalian systems, by which cytoplasmic or nuclear proteins with certain amino acids at their N-terminus are rapidly degraded, is also relevant to plants ( Vierstra, 1996 ). Amino acids that target degradation should be avoided at the N-terminus, or recombinant proteins should be targeted to alternative organelles. <h2>Subcellular targeting of foreign proteins</h2> Recombinant proteins encoded by transgenes integrated into the nuclear genome can be targeted to specific subcellular sites by including signal sequences on the expression constructs. In the absence of a signal sequence, the recombinant protein will accumulate in the cytoplasm, and this has been the most common means of expression pursued. 
However, signal sequences targeting recombinant protein to the apoplast, the plastid, the mitochondrion, vacuolar compartments, and the nucleus have also been used ( Hood et al ., 1997 ; Logan and Leaver, 2000 ; Streatfield et al ., 2003 ). Combining an apoplast targeting sequence with an endoplasmic reticulum retention sequence results in the recombinant protein being sequestered in the endoplasmic reticulum ( Streatfield et al ., 2003 ). For some organelles, including plastids and mitochondria, alternative signal sequences offer suborganellar options for targeting recombinant protein. Where recombinant proteins have been targeted to alternative subcellular sites, considerable variation in protein accumulation has been observed. For example, the receptor binding subunit of the heat-labile toxin of Escherichia coli has been expressed in six alternative subcellular locations in maize seed, with expression varying across four orders of magnitude between organelles ( Streatfield et al ., 2003 ). Factors that may affect the degree of accumulation in different subcellular locations include the biochemical environment of the compartment, for example pH, and the available space in the compartment for protein storage. Space may be a constraint for a highly expressed recombinant protein. For example, the expression of hepatitis B surface antigen in soybean cell culture, with the recombinant protein localized to the endoplasmic reticulum, resulted in dilation of this membrane network, possibly indicating that an accumulation limit had been reached ( Smith et al ., 2002 ). Signal sequences specifying subcellular targets are not always removed from recombinant proteins. This may be a consequence of vector design, as with the C-terminal endoplasmic reticulum retention signal, which is not cleaved from native proteins that carry it. 
However, even if a signal sequence is removed from its native protein, it may not be cleaved when fused to a recombinant protein with a different amino acid sequence immediately adjoining the fusion site. For example, adding the signal sequence of soybean vegetative storage protein to the N-terminus of hepatitis B surface antigen for expression in tobacco cell culture resulted in the accumulation of the antigen with attached signal peptide in the endoplasmic reticulum ( Sojikul et al ., 2003 ). Non-removal or imprecise cleavage of signal sequences may affect recombinant protein activity and may raise safety concerns for products that must pass through a regulatory path for licensing. Moreover, the use of native signal sequences of recombinant proteins in plant expression systems can result in unexpected targeting, as seen with the receptor binding subunit of the heat-labile toxin of E. coli expressed from its native signal sequence in maize endosperm tissue ( Chikwamba et al ., 2003 ). Apoplast targeting was anticipated, but protein accumulated in starch granules, suggestive of amyloplast targeting. The selection of organelle targets can also affect the post-translational processing of recombinant proteins. For example, proteins targeted through the endoplasmic reticulum can be glycosylated. If a recombinant protein is most strongly expressed in an organelle in which it will be glycosylated, but glycosylation would be detrimental to activity, one option is to mutate the normally glycosylated sites to alternative amino acids. Differences between animal and plant glycosylation patterns are being addressed in some recombinant plant systems. Anti-sense potato and tobacco lines have been generated that do not add a plant-specific moiety ( Wenderoth and von Schaewen, 2000 ), and transgenic tobacco lines have been developed with the capacity to add mammalian moieties normally absent from plant proteins ( Bakker et al ., 2001 ). 
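Where glycosylation would be detrimental, a first step towards mutating the normally glycosylated sites is to locate candidate sites in the protein sequence. The Python sketch below assumes N-linked glycosylation at the canonical sequon Asn-X-Ser/Thr, where X is any residue except proline, and proposes an Asn-to-Gln substitution at each hit. The function names and the choice of glutamine as the replacement are illustrative assumptions rather than a method taken from the studies cited, and a sequon match does not guarantee that a site is actually glycosylated in planta.

```python
import re

# Canonical N-glycosylation sequon: N, then any residue except P, then S or T.
# The lookahead lets overlapping sequons (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return 0-based positions of Asn residues in N-X-S/T sequons."""
    return [m.start() for m in SEQUON.finditer(protein)]


def remove_sequons(protein, replacement="Q"):
    """Replace each sequon Asn with another residue (Gln by default)."""
    residues = list(protein)
    for pos in find_sequons(protein):
        residues[pos] = replacement
    return "".join(residues)


if __name__ == "__main__":
    example = "MKNGSAANPTLNNTS"  # made-up sequence for illustration only
    print(find_sequons(example))
    print(remove_sequons(example))
```

Any substitution proposed this way would still need to be checked experimentally for its effect on folding and activity.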
For recombinant proteins synthesized in the plastid from sequences integrated into the plastid genome, suborganellar targeting to particular locations within this organelle can be considered. Given their endosymbiotic origin, plastids may be particularly well suited to the production of bacterial proteins. Recombinant proteins synthesized in plastids are not glycosylated, but correct disulphide bonding can be accomplished ( Staub et al ., 2000 ). In addition, other post-translational modifications, such as lipidation, can be achieved, as for the outer surface lipoprotein A of Borrelia burgdorferi expressed from the tobacco plastid genome ( Glenz et al ., 2006 ). With regard to protein processing, plant viral expression systems are more akin to nuclear than plastid systems. They allow for targeting and for glycosylation of recombinant proteins ( Kumagai et al ., 2000 ; Dirnberger et al ., 2001 ). <h2>Tissue-specific, temporal and inducible expression</h2> The choice of tissue for expression can greatly influence the accumulation and stability of recombinant proteins. Factors such as protein content, water content and the presence of proteases and protease inhibitors affect foreign protein accumulation. The choice of tissue may also be influenced by the anticipated product application. For example, an edible tissue is suitable for an oral vaccine. As discussed above, tissue-specific promoters allow for the expression of recombinant proteins only in preferred target tissues, such as tobacco leaves ( Dai et al ., 2000 ) and cereal seeds ( Hood et al ., 2003 ). The timing of expression can also affect protein accumulation and stability. For example, during the development of cereal seeds, most of the storage tissues are laid down just prior to desiccation. Expression from seed promoters active at this time will favour a similar accumulation of foreign proteins prior to desiccation, and this should promote protein stability. 
The expression of recombinant proteins at this late stage is also less likely to negatively affect plant health or embryo development. An alternative strategy, applicable to fresh tissues such as developing seedlings and leaves, is to harvest tissue expressing recombinant protein when accumulation of the protein is optimal ( Rodriguez, 1999 ; Daniell et al ., 2001a ). The use of inducible promoters allows for expression in a target tissue only following a specific treatment, such as chemical spraying ( Aoyama and Chua, 1997 ; Zuo and Chua, 2000 ; Padidam, 2003 ). In this case, protein accumulation can be restricted to a narrow timeframe to limit detrimental effects on plant health. In addition, some regulatory sequences boost activity under certain growth conditions, such that the bulk of foreign protein accumulates in a brief period prior to harvest. For example, the plastid untranslated leader sequence of the 32-kDa D1 polypeptide of photosystem II enhances translational activity on switching growth conditions from a light/dark cycle to continuous light, and this has been utilized to boost yields from plants grown under controlled light conditions ( Fernandez-San Millan et al ., 2003 ; Koya et al ., 2005 ). <h2>Protein fusions</h2> Protein fusions can address issues of subcellular targeting, stability, purification and activity of the product. Expression of a foreign protein in plants as a fusion to a plant protein present at high levels in a subcellular storage compartment within seed tissues allows for the foreign protein to be targeted to the same location as the native protein, where it can stably accumulate. For example, foreign proteins have been fused to oleosins for expression in Brassica napus seeds, where they accumulate in oil bodies ( van Rooijen and Moloney, 1995 ). Fusion of a foreign protein to the C-terminus of a single ubiquitin coding unit has been used to stabilize recombinant proteins in several expression systems. 
In plants, this has increased recombinant protein levels 10-fold ( Garbarino et al ., 1995 ). The ubiquitin unit is subsequently removed in vivo . As a further alternative, polyproteins have been co-expressed in plants with a plant viral protease. Cleavage by the protease at a recognition site engineered between sequences encoding polypeptide chains yields distinct recombinant proteins ( Dasgupta et al ., 1998 ). These proteins can be targeted to different organelles. Fusion of a foreign protein or peptide to a second recombinant protein that has been shown to be stably expressed in plants can act to stabilize the target protein or peptide. The fusion of a tuberculosis antigen to the receptor binding subunit of the heat-labile toxin of E. coli follows this strategy ( Rigano et al ., 2004 ). An added advantage in this case is that the heat-labile toxin subunit can direct the fused antigen to ganglioside receptors on the surface of the gut, so facilitating delivery of the product to the target tissue. In addition, fusion of a foreign protein to a recombinant protein with defined binding characteristics or to an affinity tag can simplify purification. This has been applied in insect cells and other recombinant systems with fusions to molecules such as avidin ( Airenne et al ., 1999 ). However, later cleavage from the protein carrier or peptide tag may be necessary to release the desired product, so adding to the downstream processing costs. With plant viral expression systems, a peptide or protein is often expressed as a fusion to a viral coat protein ( Canizares et al ., 2005 ). Such fusions serve to stabilize the expressed protein or peptide, and also simplify purification, as plant viral particles can easily be separated from plant tissues. With plant-based vaccines the plant viral particles may also enhance the immunogenicity of the incorporated antigens. 
However, there is generally a strict size limit on the length of foreign sequence that can be fused without disrupting the assembly of the viral protein coat. Typically, only up to a few tens of amino acids can be fused to the coat protein ( Porta et al ., 2003 ). The isoelectric point of the sequence fused to the viral coat protein also affects the ability of the fusion to systemically infect plant tissues. With cowpea mosaic virus, the addition of a peptide with a high isoelectric point has a negative impact on the yield ( Porta et al ., 2003 ). A strategy to overcome the size limit on viral coat fusions, but still to obtain particle assembly, is to design a fusion with the 16-amino-acid 2A peptide of foot-and-mouth disease virus positioned between the foreign protein and the coat protein. A proportion of the fusion is co-translationally cleaved to yield coat protein, allowing for the assembly of particles containing both coat protein and uncleaved fusion molecules ( Cruz et al ., 1996 ; Smolenska et al ., 1998 ). Use of the 2A peptide has also been extended to stable transgenic plants to yield multiple proteins from a single transgene that can be independently targeted to different organelles ( El Amrani et al ., 2004 ). <h2>Expression of zymogens</h2> The expression of proteases in plants offers a particular challenge, as successful expression is likely to result in the degradation of plant proteins and poor plant health. Organellar targeting and tissue-specific expression may serve to limit negative effects, but an alternative strategy was followed to express commercial levels of bovine trypsin in maize ( Woodard et al ., 2003 ). In this case, expression was achieved without excessive proteolysis of plant proteins and poor plant health by producing the zymogen form of the protease. 
This strategy potentially calls for an additional proteolytic cleavage step during product purification, but, in the case of trypsinogen, the zymogen form of the enzyme was converted to active trypsin either in seed tissues or during protein extraction from seed ( Woodard et al ., 2003 ). <h1>Future prospects</h1> The expression of foreign proteins in plants with a view to commercial production has attracted considerable attention over the past decade. However, only a few small-scale products have so far reached the market. These are pure protein products that require cost-effective processing and purification schemes for commercial viability, and currently they are not produced economically on a large scale ( Hood et al ., 1997 ; Woodard et al ., 2003 ). Advances are required to significantly boost expression further, but, to date, many of the available tools discussed above have not been put together to optimize the expression of a single protein product. This stacking of approaches will probably be required to produce commercial products, although this can be limited by access to the necessary intellectual property. Moreover, when implementing new strategies and combining currently available approaches, care must be taken to minimize potential negative influences, in particular gene silencing ( De Wilde et al ., 2000 ). In the near term, enzymes for large-scale industrial processes and antigens for oral animal vaccines are the most likely plant-expressed products to be commercially viable. Purification is not necessary for these products, and regulatory paths are less stringent than for pharmaceuticals ( Streatfield, 2005 ). With edible vaccines, expression levels of some antigens are already sufficient for economic products, provided that promising early-stage efficacy data are confirmed in later trials ( Lamphear et al ., 2004 ). With industrial enzymes, further increases in expression will be necessary. 
Much more attention will also need to be applied to technical hurdles to ensure correct post-translational processing and protein stability in plant tissues. Post-translational processing is likely to be particularly important for pharmaceuticals, for which safety and efficacy concerns over issues such as plant-specific glycosylation may arise. Moreover, recombinant protein stability will be affected by host proteases. For recombinant protein production in microbes, host lines have been developed lacking proteases ( Meerman and Georgiou, 1994 ). A similar approach with plants, achieved through gene knock-out or silencing, should serve to increase recombinant protein stability. Foreign protein expression in plants has largely focused on the expression of one or, in a few cases such as antibodies, a few molecules at one time ( Hiatt et al ., 1989 ; During et al ., 1990 ). However, in some cases it may be necessary to express several proteins in the same plant material, for example for an edible vaccine comprising several antigens along with a protein adjuvant. As discussed above, strategies are available to express multiple proteins from single vectors or to combine expressing lines. However, the development of plant artificial chromosomes should allow for many more proteins to be co-expressed ( Brown et al ., 2000 ).

Journal

Plant Biotechnology Journal, Wiley

Published: Jan 1, 2007

Keywords: foreign gene expression; heterologous gene expression; industrial protein production; plant-based vaccines; plant-produced pharmaceuticals; recombinant protein production
