Heterologous production of 10-methylstearic acid by cells expressing recombinant methyltransferase

US11236373

Disclosed herein are cells, nucleic acids, and proteins that can be used to produce branched (methyl)lipids, such as 10-methylstearic acids, and compositions that include such lipids. Cells disclosed herein comprise methyltransferase and/or reductase genes from bacteria of the class Gammaproteobacteria, which encode enzymes capable of catalyzing the production of branched (methyl)lipids from unbranched, unsaturated lipids. Saturated branched (methyl)lipids produced using embodiments of the present invention have favorable low-temperature fluidity and favorable oxidative stability, which are desirable properties for lubricants and specialty fluids.

Patent 11236373
Priority Sep 20 2017
Filed Sep 20 2018
Issued Feb 01 2022
Expiry Sep 20 2038
Inventors Blitzblau,…
Assg.orig NOVOGY, IN…
Assg.curr GINKGO BIO…
Entity Large
Referenced by 0
References 3
Maint.: currently ok

21. A nucleic acid comprising a recombinant tmpb gene encoding a tmpb protein from a bacterium of the genus Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, Thiohalorhabdus, Desulfotignum, or Halofilum and a first promoter operably linked to the recombinant tmpb gene.

1. A cell comprising an exogenous tmpb gene encoding a tmpb protein from a bacterium of the genus Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, Thiohalorhabdus, Desulfotignum, or Halofilum and either a branched (methyl)lipid or an exomethylene-substituted lipid, wherein

(1) the branched (methyl)lipid is a carboxylic acid, carboxylate, ester, thioester, or amide, and

the branched (methyl)lipid comprises a saturated or unsaturated branched aliphatic chain comprising a branching methyl group; or

(2) the exomethylene-substituted lipid is a carboxylic acid, carboxylate, ester, thioester, or amide,

the exomethylene-substituted lipid comprises a branched aliphatic chain, and

the aliphatic chain is substituted with an exomethylene group.

2. The cell of claim 1, wherein the branched (methyl)lipid or the exomethylene-substituted lipid is a fatty acid from 14 to 18 carbons long with a methyl moiety in the Δ9, Δ10, or Δ11 position.

3. The cell of claim 2, wherein the branched (methyl)lipid is 10-methylstearate, or an ester, thioester, or amide thereof or the exomethylene-substituted lipid is 10-methylenestearate, or an ester, thioester, or amide thereof.

4. The cell of claim 1, wherein the tmpb protein is Desulfobacula balticum enzyme tmpb, Marinobacter hydrocarbonclasticus enzyme tmpb, Thiohalospira halophila enzyme tmpb, Desulfobacter curvatus enzyme tmpb, Desulfobacter phenolica enzyme tmpb, Desulfobacula toluolica enzyme tmpb, Desulfobacter postgatei enzyme tmpb, Halofilum ochraceum enzyme tmpb, or Marinobacter aquaeolei enzyme tmpb.

5. The cell of claim 1, wherein the tmpb protein has at least 90% sequence identity to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18.

6. The cell of claim 1, further comprising a recombinant tmpA gene encoding a reductase tmpA protein from a bacterium of the genus Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, Thiohalorhabdus, Desulfotignum, or Halofilum.

7. The cell of claim 6, wherein the tmpA protein is capable of converting a methylene-substituted lipid to a methyl-substituted lipid.

8. The cell of claim 7, wherein the methylene-substituted lipid is a fatty acid from 14 to 18 carbons long with a methylene substitution in the Δ9, Δ10, or Δ11 position and the methyl-substituted lipid is a fatty acid from 14 to 18 carbons long with a methyl moiety in the Δ9, Δ10, or Δ11 position.

9. The cell of claim 6, wherein the tmpA protein is selected from Desulfobacula balticum enzyme tmpA, Marinobacter hydrocarbonclasticus enzyme tmpA, Thiohalospira halophila enzyme tmpA, Desulfobacter curvatus enzyme tmpA, Desulfobacter phenolica enzyme tmpA, Desulfobacula toluolica enzyme tmpA, Desulfobacter postgatei enzyme tmpA, Halofilum ochraceum enzyme tmpA, and Marinobacter aquaeolei enzyme tmpA.

10. The cell of claim 6, wherein the tmpA protein has at least 90% sequence identity to SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36.

11. The cell of claim 6, wherein the tmpb gene and the tmpA gene are included in a single open reading frame encoding a fusion protein comprising both the tmpb protein and the tmpA protein.

12. The cell of claim 1, wherein to 15% by weight of the fatty acids of the cell are 10-methyl fatty acids or 10-methylene fatty acids.

13. The cell of claim 1, wherein the cell lacks an endogenous methyltransferase gene.

14. The cell of claim 1, wherein the cell lacks the endogenous ability to produce the branched (methyl)lipid or exomethylene-substituted lipid.

15. The cell of claim 1, wherein the cell is a bacterial cell, a fungal cell, an algal cell, a mold cell, a plant cell, or a yeast cell.

16. The cell of claim 1, wherein the cell is a fungal cell, an algal cell, a mold cell, a plant cell, or a yeast cell.

17. The cell of claim 1, wherein the cell is a yeast cell.

18. The cell of claim 1, wherein the cell is selected from the group consisting of Arxula, Aspergillus, Aurantiochytrium, Candida, Claviceps, Cryptococcus, Cunninghamella, Geotrichum, Hansenula, Kluyveromyces, Kodamaea, Leucosporidiella, Lipomyces, Mortierella, Ogataea, Pichia, Prototheca, Rhizopus, Rhodosporidium, Rhodotorula, Saccharomyces, Schizosaccharomyces, Tremella, Trichosporon, Wickerhamomyces, and Yarrowia.

19. The cell of claim 1, wherein the cell is selected from the group consisting of Arxula adeninivorans, Aspergillus niger, Aspergillus oryzae, Aspergillus terreus, Aurantiochytrium limacinum, Candida utilis, Claviceps purpurea, Cryptococcus albidus, Cryptococcus curvatus, Cryptococcus ramirezgomezianus, Cryptococcus terreus, Cryptococcus wieringae, Cunninghamella echinulata, Cunninghamella japonica, Geotrichum fermentans, Hansenula polymorpha, Kluyveromyces lactis, Kluyveromyces marxianus, Kodamaea ohmeri, Leucosporidiella creatinivora, Lipomyces lipofer, Lipomyces starkeyi, Lipomyces tetrasporus, Mortierella isabellina, Mortierella alpina, Ogataea polymorpha, Pichia ciferrii, Pichia guilliermondii, Pichia pastoris, Pichia stipitis, Prototheca zopfii, Rhizopus arrhizus, Rhodosporidium babjevae, Rhodosporidium toruloides, Rhodosporidium paludigenum, Rhodotorula glutinis, Rhodotorula mucilaginosa, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Tremella encephala, Trichosporon cutaneum, Trichosporon fermentans, Wickerhamomyces ciferrii, and Yarrowia lipolytica.

20. A method of producing a branched (methyl)lipid or exomethylene-substituted lipid, comprising contacting the cell of claim 1 with a substrate fatty acid, methionine, or both a substrate fatty acid and methionine, wherein the substrate fatty acid comprises a fatty acid from 14 to 18 carbons long with a double bond in the Δ9, Δ10, or Δ11 position.

22. The nucleic acid of claim 21, wherein the tmpb gene has at least 80% sequence identity to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17.

23. The nucleic acid of claim 21, wherein the tmpb gene encodes a protein having at least 90% sequence identity to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18.

24. The nucleic acid of claim 21, wherein the tmpb gene is codon-optimized for expression in yeast, algae, or plants.

25. The nucleic acid of claim 21, further comprising a tmpA gene encoding a tmpA protein from a bacterium of the genus Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, Thiohalorhabdus, Desulfotignum, or Halofilum.

26. The nucleic acid of claim 25, wherein the reductase tmpA gene has at least 80% sequence identity to SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, or SEQ ID NO:35.

27. The nucleic acid of claim 25, wherein the tmpA gene encodes a protein having at least 90% sequence identity to SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36.

28. The nucleic acid of claim 25, wherein the tmpA gene is fused in frame with the tmpb gene.

29. The nucleic acid of claim 28, further comprising a nucleic acid linker sequence between the tmpb gene and the tmpA gene, wherein the nucleic acid linker sequence encodes a linker peptide between the tmpb protein and the tmpA protein.

30. The nucleic acid of claim 25, wherein the tmpA gene is operably linked to a second promoter.

31. The nucleic acid of claim 21, wherein the promoter is a yeast promoter, an algae promoter, or a plant promoter.

32. The nucleic acid of claim 21, wherein the promoter is a yeast promoter.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a National Phase Application under 35 U.S.C. § 371 of International Application No. PCT/US2018/051919, filed Sep. 20, 2018, which claims the benefit of priority to U.S. Provisional Patent Application Ser. No. 62/561,136, filed Sep. 20, 2017, each of which is hereby incorporated by reference in its entirety.

This application is related to U.S. Ser. No. 15/710,734 and PCT/US17/52491 both filed Sep. 20, 2017.

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been filed electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Sep. 20, 2018, is named novgp0006wo_sequencelisting.txt and is 124,309 bytes in size.

BACKGROUND OF THE INVENTION

A. Field of the Invention

The invention generally concerns production of branched (methyl)lipids by cells expressing recombinant methyltransferases and/or reductases derived from Gammaproteobacteria.

B. Description of Related Art

Fatty acids derived from agricultural plant and animal oils find use as industrial lubricants, hydraulic fluids, greases, and other specialty fluids in addition to oleochemical feedstocks for processing. The physical and chemical properties of these fatty acids result in large part from their carbon chain length and number of unsaturated double bonds. Fatty acids are typically 16:0 (sixteen carbons, zero double bonds), 16:1 (sixteen carbons, 1 double bond), 18:0, 18:1, 18:2, or 18:3. Importantly, fatty acids with no double bonds (saturated) have high oxidative stability, but they solidify at low temperature. Double bonds improve low-temperature fluidity, but decrease oxidative stability. This trade-off poses challenges for lubricant and other specialty-fluid formulations because consistent long term performance (high oxidative stability) over a wide range of operating temperatures is desirable. High 18:1 (oleic) fatty acid oils provide low temperature fluidity with relatively good oxidative stability. Accordingly, several commercial products, such as high oleic soybean oil, high oleic sunflower oil, and high oleic algal oil, have been developed with high oleic compositions. Oleic acid is an alkene, however, and subject to oxidative degradation.
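The shorthand used above can be read mechanically. The following is a minimal Python sketch, assuming a hypothetical helper named `parse_shorthand` and a hypothetical `FattyAcid` record (neither is part of the disclosure), that splits a "carbons:double bonds" label and reports whether the chain is saturated:

```python
from typing import NamedTuple


class FattyAcid(NamedTuple):
    """Hypothetical record for a 'carbons:double bonds' label such as '18:1'."""
    carbons: int
    double_bonds: int

    @property
    def saturated(self) -> bool:
        # A fatty acid with zero double bonds is saturated.
        return self.double_bonds == 0


def parse_shorthand(label: str) -> FattyAcid:
    """Parse a label of the form 'C:D' (e.g., '16:0') into a FattyAcid."""
    carbons, double_bonds = label.split(":")
    return FattyAcid(int(carbons), int(double_bonds))


# The six typical fatty acids named in the text.
for label in ("16:0", "16:1", "18:0", "18:1", "18:2", "18:3"):
    fa = parse_shorthand(label)
    print(label, "saturated" if fa.saturated else "unsaturated")
```

The sketch handles only the plain "C:D" form; labels carrying a double-bond position (for example, a Δ9 suffix) would need additional parsing.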

A superior alternative is the addition of a fully saturated methyl branch to the fatty acid chain. This creates a similar melting-temperature depression as a double bond, but with no decrease in oxidative stability versus fully saturated linear fatty acids. Methyl branches located near the middle of the fatty acid chain have the largest melting-temperature depression. Several chemical processes have been explored to introduce methyl branches; however, the preferred industrial method results in random placement of the methyl branch and creates a substantial amount of by-product. There remains a need for efficient and economical processes of producing branched (methyl)lipids.

SUMMARY OF THE INVENTION

Disclosed herein are cells, nucleic acids, and proteins that can be used to produce branched (methyl)lipids, such as 10-methylstearic acids, and compositions that include such lipids. Saturated branched (methyl)lipids produced using embodiments of the present invention have favorable low-temperature fluidity and favorable oxidative stability, which are desirable properties for lubricants and specialty fluids.

Various aspects relate to nucleic acids comprising a recombinant tmpB gene encoding a methyltransferase protein and/or a recombinant tmpA gene encoding a reductase protein. The methyltransferase protein and/or reductase protein may be proteins expressed by species of the class Gammaproteobacteria (phylum, Proteobacteria), and the recombinant tmpB gene and/or recombinant tmpA gene may be codon-optimized for expression in a different phylum of bacteria or in eukaryotes (e.g., yeast, such as Arxula adeninivorans (also known as Blastobotrys adeninivorans or Trichosporon adeninivorans), Saccharomyces cerevisiae, or Yarrowia lipolytica). The recombinant tmpB gene or recombinant tmpA gene may be operably-linked to a promoter capable of driving expression in a phylum of bacteria other than Gammaproteobacteria or in eukaryotes (e.g., yeast). The nucleic acid may be a plasmid or a chromosome.

Some aspects relate to a cell comprising a nucleic acid as described herein. The cell may comprise a branched (methyl)lipid, such as 10-methylstearic acid, and/or an exomethylene-substituted lipid, such as 10-methylenestearic acid. The cell may be a eukaryotic cell, such as an algae cell, yeast cell, or plant cell.

Some aspects relate to a composition produced by cultivating a cell culture comprising cells as described herein. The oil composition may comprise a branched (methyl)lipid, such as 10-methylstearic acid, and/or an exomethylene-substituted lipid, such as 10-methylenestearic acid. In some embodiments, the oil composition is produced by cultivating a cell culture and recovering the oil composition from the cell culture, wherein the oil composition comprises 10-methyl fatty acids, and wherein the 10-methyl fatty acids comprise at least about 1% by weight of the total fatty acids in the oil composition. In some embodiments, the 10-methyl fatty acids comprise at least about 15% by weight of the total fatty acids in the oil composition.
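The weight-percent thresholds above reduce to simple arithmetic. The following Python sketch, offered only as an illustration, computes the share of selected fatty acids in a profile. The function name `weight_percent`, the profile labels, and the masses are all hypothetical and are not measured data from this disclosure.

```python
def weight_percent(profile: dict[str, float], targets: set[str]) -> float:
    """Return the mass of the target fatty acids as a percentage of all fatty acids."""
    total = sum(profile.values())
    if total <= 0:
        raise ValueError("profile must contain a positive total mass")
    target_mass = sum(mass for name, mass in profile.items() if name in targets)
    return 100.0 * target_mass / total


# Hypothetical masses (mg) chosen for illustration only.
profile = {"16:0": 20.0, "18:0": 10.0, "18:1": 60.0, "10-methyl 18:0": 10.0}
pct = weight_percent(profile, {"10-methyl 18:0"})

print(f"10-methyl fatty acids: {pct:.1f}% by weight of total fatty acids")
print("meets the about 1% level:", pct >= 1.0)
print("meets the about 15% level:", pct >= 15.0)
```

With these made-up masses the 10-methyl fraction is 10 mg out of 100 mg total, so the first level is met and the second is not.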

Some aspects relate to a method of producing an oil composition, the method comprising: cultivating a cell culture comprising any of the cells disclosed herein; and recovering the oil composition from the cell culture. In some embodiments, the method further comprises contacting the cell culture with a substrate comprising a fatty acid from 14 to 18 carbons long with a double bond in the Δ9, Δ10, or Δ11 position. In some embodiments, recovering the oil composition from the cell culture comprises recovering lipids that have been secreted by the cell. In some embodiments, producing the oil composition comprises performing chemical reactions or causing chemical reactions to be performed in which oleic acid and methionine substrates are converted to 10-methylenestearic acid, wherein the chemical reactions are catalyzed by a tmpB protein. In some embodiments, producing the oil composition comprises performing chemical reactions or causing chemical reactions to be performed in which 10-methylene stearic acid is reduced to 10-methylstearic acid, wherein the chemical reactions are catalyzed by tmpA protein. In some embodiments, the reduction is performed using NADPH, ferredoxin, flavodoxin, rubredoxin, cytochrome c, or combinations thereof as reducing agents. In any of the methods disclosed herein that involve reduction reactions any one of, or any combination of, NADPH, ferredoxin, flavodoxin, rubredoxin, and cytochrome c may be used.
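The substrate criterion stated above (14 to 18 carbons with a double bond in the Δ9, Δ10, or Δ11 position) can be expressed as a simple check. The following is a minimal Python sketch, assuming a hypothetical function name `is_candidate_substrate` and representing each fatty acid by its carbon count and a tuple of double-bond positions; it says nothing about whether a given enzyme actually accepts the substrate.

```python
def is_candidate_substrate(carbons: int, double_bond_positions: tuple[int, ...]) -> bool:
    """True if the chain is 14 to 18 carbons long with a double bond at position 9, 10, or 11."""
    has_target_bond = any(pos in (9, 10, 11) for pos in double_bond_positions)
    return 14 <= carbons <= 18 and has_target_bond


# Illustrative inputs: (name, carbons, double-bond positions).
examples = [
    ("oleic acid", 18, (9,)),
    ("vaccenic acid", 18, (11,)),
    ("palmitoleic acid", 16, (9,)),
    ("stearic acid", 18, ()),
]
for name, carbons, positions in examples:
    print(name, is_candidate_substrate(carbons, positions))
```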

Other objects, features and advantages of the present invention will become apparent from the following figures, detailed description, and examples. It should be understood, however, that the figures, detailed description, and examples, while indicating specific embodiments of the invention, are given by way of illustration only and are not meant to be limiting. Additionally, it is contemplated that changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts one possible mechanism for the conversion of oleic acid to 10-methylstearic acid. An oleic acid substrate may be present as an acyl chain of a glycerolipid or phospholipid. A methionine substrate, which donates the methyl group, may be present as S-adenosyl methionine. The oleic acid and methionine substrates may be converted to 10-methylenestearic acid (e.g., present as an acyl chain of a glycerolipid or phospholipid) and homocysteine (e.g., present as S-adenosyl homocysteine). This reaction may be catalyzed by a tmpB protein as described herein, infra. 10-methylenestearic acid (e.g., present as an acyl chain of a glycerolipid or phospholipid) may be reduced to 10-methylstearic acid. The reduction may be catalyzed by a tmpA protein as described herein, infra, for example, without limitation, using NADPH as a reducing agent. Other examples of the reducing agent may include, without limitation, ferredoxin, flavodoxin, rubredoxin, cytochrome c, or combinations thereof. The language of the specification and claims, however, is not limited to any particular reaction mechanism.
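The two steps described for FIG. 1 can be checked by bookkeeping on molecular formulas of the free acids. The following Python sketch is illustrative only: the function names are hypothetical, and the per-step changes (a net gain of CH2 on going from oleic acid to 10-methylenestearic acid, then a gain of two hydrogens on reduction) are taken from comparing the formulas of the named compounds, not from any particular mechanism.

```python
from collections import Counter

# Molecular formula of oleic acid (free acid form), C18H34O2.
OLEIC_ACID = Counter({"C": 18, "H": 34, "O": 2})


def add_exomethylene(formula: Counter) -> Counter:
    """Net formula change for the first step: one carbon and two hydrogens are gained."""
    return formula + Counter({"C": 1, "H": 2})


def reduce_exomethylene(formula: Counter) -> Counter:
    """Net formula change for the second step: two hydrogens are gained."""
    return formula + Counter({"H": 2})


def as_text(formula: Counter) -> str:
    """Render a C/H/O formula as text, e.g. 'C18H34O2'."""
    return "".join(f"{element}{formula[element]}" for element in ("C", "H", "O"))


methylene_stearic = add_exomethylene(OLEIC_ACID)
methyl_stearic = reduce_exomethylene(methylene_stearic)

print("oleic acid:", as_text(OLEIC_ACID))
print("10-methylenestearic acid:", as_text(methylene_stearic))
print("10-methylstearic acid:", as_text(methyl_stearic))
```

The sketch tracks only the fatty acid itself; the cofactors named in the text (S-adenosyl methionine, NADPH, and the other reducing agents) are deliberately left out.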

FIG. 2 shows the occurrence of cyclopropane fatty acyl phospholipid synthase (cfa) homologs and 10-methylpalmitic acid (10Me16) in certain Gammaproteobacteria with sequenced genomes and observed lipid profiles.

FIGS. 3A-3B depict maps of the following vectors, which encode a tmp operon: pNC1071 (SEQ ID NO:39), which includes a Desulfobacter postgatei tmp operon; pNC1072 (SEQ ID NO:40), which includes a Desulfobacula balticum tmp operon, pNC1073 (SEQ ID NO:41), which includes a Desulfobacula toluolica tmp operon; pNC1074 (SEQ ID NO:42), which includes a Marinobacter hydrocarbonclasticus tmp operon; and pNC1076 (SEQ ID NO:43), which includes a Thiohalospira halophila tmp operon.

FIG. 4 is a graph showing the percentage of 10-methylene fatty acids in Saccharomyces cerevisiae transformed with plasmids expressing tmpB from the indicated species: D. postgatei (D.po.), D. balticum (D.ba.), D. toluolica (D.to.), M. hydrocarbonclasticus (M.hy.) and T. halophila (T.ha.), or an empty vector control (−).

FIG. 5 is a graph showing the percentage of 10-methylene fatty acids in Yarrowia lipolytica transformed with plasmids expressing tmpB from the indicated species: D. postgatei (D.po.), D. balticum (D.ba.), D. toluolica (D.to.), M. hydrocarbonclasticus (M.hy.) and T. halophila (T.ha.), or an empty vector control (−).

FIG. 6 shows the fatty acid profile of E. coli Top10 cells with plasmids pNC1071, pNC1072, pNC1073, pNC1074, pNC1076, and pNC53 (empty control vector) grown in LB medium. Percentage values show the weight percent of the indicated fatty acid as a percentage of all fatty acids. 14:0=myristic acid, 16:0=palmitic acid, 16:1Δ9=palmitoleic acid, 16:0cyc=17Δ,cis-9,10-methylenehexadecanoic acid, 10-methylene 16:0=10-methylenehexadecanoic acid, 18:1Δ11=vaccenic acid, 18:0=stearic acid, SD=standard deviation.

FIGS. 7A-7D show a CLUSTAL OMEGA alignment of tmpB protein sequences encoded by the tmpB genes from Desulfobacula balticum, Marinobacter hydrocarbonclasticus, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, and Marinobacter aquaeolei, along with the cyclopropane fatty acid synthase (Cfa) enzyme from Escherichia coli.

FIGS. 8A-8D show a CLUSTAL OMEGA alignment of tmpA protein sequences encoded by the tmpA genes from Desulfobacula balticum, Marinobacter hydrocarbonclasticus, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, and Marinobacter aquaeolei, along with the Archaeoglobus fulgidus geranylgeranyl reductase protein AF0464.

DETAILED DESCRIPTION OF THE INVENTION

A. Definitions

The articles “a” and “an” are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element.

The term “biologically-active portion” refers to an amino acid sequence that is less than a full-length amino acid sequence, but exhibits at least one activity of the full-length sequence. For example, a biologically-active portion of a methyltransferase may refer to one or more domains of tmpB having biological activity for converting oleic acid (e.g., a phospholipid comprising an ester of oleate) and methionine (e.g., S-adenosyl methionine) into 10-methylenestearic acid (e.g., a phospholipid comprising an ester of 10-methylenestearate). A biologically-active portion of a reductase may refer to one or more domains of tmpA having biological activity for converting 10-methylenestearic acid (e.g., a phospholipid comprising an ester of 10-methylenestearate) and a reducing agent (e.g., ferredoxin, flavodoxin, rubredoxin, cytochrome c, NADH, NADPH, FAD, FADH₂, FMNH₂) into 10-methylstearic acid (e.g., a phospholipid comprising an ester of 10-methylstearate). Biologically-active portions of a protein include peptides or polypeptides comprising amino acid sequences sufficiently identical to or derived from the amino acid sequence of the protein, e.g., the amino acid sequence set forth in SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, or 36, that include fewer amino acids than the full-length protein, and exhibit at least one activity of the protein, especially methyltransferase or reductase activity.
A biologically-active portion of a protein may comprise, comprise at least, or comprise at most, for example, any whole number of amino acids from 50 to 500, inclusive, or more amino acids, or any range derivable therein. Typically, biologically-active portions comprise a domain or motif having a catalytic activity, such as catalytic activity for producing 10-methylenestearic acid or 10-methylstearic acid. A biologically-active portion of a protein includes portions of the protein that have the same activity as the full-length peptide and every portion that has more activity than background. For example, a biologically-active portion of an enzyme may have, have at least, or have at most 0.1%, 0.5%, 1%, 2%, 3%, 4%, 5%, 10%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, 100%, 100.1%, 100.2%, 100.3%, 100.4%, 100.5%, 100.6%, 100.7%, 100.8%, 100.9%, 101%, 105%, 110%, 115%, 120%, 125%, 130%, 135%, 140%, 145%, 150%, 160%, 170%, 180%, 190%, 200%, 220%, 240%, 260%, 280%, 300%, 320%, 340%, 360%, 380%, 400% or higher activity relative to the full-length enzyme (or any range derivable therein). A biologically-active portion of a protein may include portions of a protein that lack a domain that targets the protein to a cellular compartment.

The terms “codon optimized” and “codon-optimized for the cell” refer to coding nucleotide sequences (e.g., genes) that have been altered to substitute at least one codon that is relatively rare in a desired host cell with a synonymous codon that is relatively prevalent in the host cell. Codon optimization thereby allows for better utilization of the tRNA of a host cell by matching the codons of a recombinant gene with the tRNA of the host cell. For example, the codon usage of the species of Gammaproteobacteria (prokaryotes) varies from the codon usage of yeast (eukaryotes). The translation efficiency in a yeast host cell of an mRNA encoding a Gammaproteobacteria protein may be increased by substituting the codons of the corresponding Gammaproteobacteria gene with codons that are more prevalent in the particular species of yeast. A codon optimized gene thereby has a nucleotide sequence that varies from a naturally-occurring gene.
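The synonymous-codon substitution described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the disclosure: the function names are hypothetical, the `preferred` mapping is a toy stand-in for a real host codon-usage table, and the sketch simply swaps every codon for the listed preferred codon of the same amino acid using the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds: str) -> str:
    """Translate a coding sequence (length divisible by 3) into one-letter amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds: str, preferred: dict[str, str]) -> str:
    """Replace each codon with the host-preferred synonymous codon, if one is listed."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    for aa, codon in preferred.items():
        if CODON_TABLE.get(codon) != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    optimized = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        # Codons for amino acids without a listed preference are left unchanged.
        optimized.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(optimized)


# Toy preference table for illustration only; real work would use host usage data.
toy_preferred = {"L": "TTG", "R": "AGA"}
original = "ATGCTCCGGTAA"  # encodes M, L, R, stop
optimized = codon_optimize(original, toy_preferred)

print(original, "->", optimized)
print("protein unchanged:", translate(original) == translate(optimized))
```

Because only synonymous codons are substituted, the encoded protein is unchanged while the nucleotide sequence differs from the starting gene, which matches the definition given above.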

The term “constitutive promoter” refers to a promoter that mediates the transcription of an operably linked gene independent of a particular stimulus (e.g., independent of the presence of a reagent such as isopropyl β-D-1-thiogalactopyranoside).

The term “DGAT1” refers to a gene that encodes a type 1 diacylglycerol acyltransferase protein, such as a gene that encodes a yeast DGA2 protein.

The term “DGAT2” refers to a gene that encodes a type 2 diacylglycerol acyltransferase protein, such as a gene that encodes a yeast DGA1 protein.

“Diacylglyceride,” “diacylglycerol,” and “diglyceride,” are esters comprised of glycerol and two fatty acids.

The terms “diacylglycerol acyltransferase” and “DGA” refer to any protein that catalyzes the formation of triacylglycerides from diacylglycerol. Diacylglycerol acyltransferases include type 1 diacylglycerol acyltransferases (DGA2), type 2 diacylglycerol acyltransferases (DGA1), and type 3 diacylglycerol acyltransferases (DGA3) and all homologs that catalyze the above-mentioned reaction.

The terms “diacylglycerol acyltransferase, type 1” and “type 1 diacylglycerol acyltransferases” refer to DGA2 and DGA2 orthologs.

The terms “diacylglycerol acyltransferase, type 2” and “type 2 diacylglycerol acyltransferases” refer to DGA1 and DGA1 orthologs.

The term “domain” refers to a part of the amino acid sequence of a protein that is able to fold into a stable three-dimensional structure independent of the rest of the protein.

The term “drug” refers to any molecule that inhibits cell growth or proliferation, thereby providing a selective advantage to cells that contain a gene that confers resistance to the drug. Drugs include antibiotics, antimicrobials, toxins, and pesticides.

“Dry weight” and “dry cell weight” mean weight determined in the relative absence of water. For example, reference to oleaginous cells as comprising a specified percentage of a particular component by dry cell weight means that the percentage is calculated based on the weight of the cell after substantially all water has been removed. The term “% dry weight,” when referring to a specific fatty acid (e.g., oleic acid or 10-methylstearic acid), includes fatty acids that are present as carboxylates, esters, thioesters, and amides. For example, a cell that comprises 10-methylstearic acid as a percentage of total fatty acids by % dry cell weight includes 10-methylstearic acid, 10-methylstearate, the 10-methylstearate portion of a diacylglycerol comprising a 10-methylstearate ester, the 10-methylstearate portion of a triacylglycerol comprising a 10-methylstearate ester, the 10-methylstearate portion of a phospholipid comprising a 10-methylstearate ester, and the 10-methylstearate portion of 10-methylstearate CoA. The term “% dry weight,” when referring to a specific type of fatty acid (e.g., C16 fatty acids, C18 fatty acids), includes fatty acids that are present as carboxylates, esters, thioesters, and amides as described above (e.g., for 10-methylstearic acid).

The term “gene,” as used herein, may encompass genomic sequences that contain exons, particularly polynucleotide sequences encoding polypeptide sequences involved in a specific activity. The term further encompasses synthetic nucleic acids that did not derive from genomic sequence. In certain embodiments, the genes lack introns, as they are synthesized based on the known DNA sequence of cDNA and protein sequence. In other embodiments, the genes are synthesized, non-native cDNA wherein the codons have been optimized for expression in Y. lipolytica or A. adeninivorans based on codon usage. The term can further include nucleic acid molecules comprising upstream, downstream, and/or intron nucleotide sequences.

The term “inducible promoter” refers to a promoter that mediates the transcription of an operably linked gene in response to a particular stimulus.

The term “integrated” refers to a nucleic acid that is maintained in a cell as an insertion into the cell's genome, such as insertion into a chromosome, including insertions into a plastid genome.

“In operable linkage” and “operably linked” refer to a functional linkage between two nucleic acid sequences, such as a control sequence (typically a promoter) and the linked sequence (typically a sequence that encodes a protein, also called a coding sequence). A promoter is in operable linkage with a gene or is operably linked to a gene if it can mediate transcription of the gene.

The term “nucleic acid” refers to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function. The following are non-limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. A polynucleotide may be further modified, such as by conjugation with a labeling component. In all nucleic acid sequences provided herein, U nucleotides are interchangeable with T nucleotides.

The term “phospholipid” refers to esters comprising glycerol, two fatty acids, and a phosphate. The phosphate may be covalently linked to carbon-3 of the glycerol and comprise no further substitution, i.e., the phospholipid may be a phosphatidic acid. The phosphate may be substituted with ethanolamine (e.g., phosphatidylethanolamine), choline (e.g., phosphatidylcholine), serine (e.g., phosphatidylserine), inositol (e.g., phosphatidylinositol), inositol phosphate (e.g., phosphatidylinositol-3-phosphate, phosphatidylinositol-4-phosphate, phosphatidylinositol-5-phosphate), inositol bisphosphate (e.g., phosphatidylinositol-4,5-bisphosphate), or inositol trisphosphate (e.g., phosphatidylinositol-3,4,5-trisphosphate).

As used herein, the term “plasmid” refers to a circular DNA molecule that is physically separate from an organism's genomic DNA. Plasmids may be linearized before being introduced into a host cell (referred to herein as a linearized plasmid). Linearized plasmids may not be self-replicating, but may integrate into and be replicated with the genomic DNA of an organism.

A “promoter” is a nucleic acid control sequence that directs the transcription of a nucleic acid. As used herein, a promoter includes the necessary nucleic acid sequences near the start site of transcription.

The term “protein” refers to molecules that comprise an amino acid sequence, wherein the amino acids are linked by peptide bonds.

“Transformation” refers to the transfer of a nucleic acid into a host organism or into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid are referred to as “recombinant,” “transgenic,” or “transformed” organisms. Thus, nucleic acids of the present invention can be incorporated into recombinant constructs, typically DNA constructs, capable of introduction into and replication in a host cell. Such a construct can be a vector that includes a replication system and sequences that are capable of transcription and translation of a polypeptide-encoding sequence in a given host cell. Typically, expression vectors include, for example, one or more cloned genes under the transcriptional control of 5′ and 3′ regulatory sequences and a selectable marker. Such vectors also can contain a promoter regulatory region (e.g., a regulatory region controlling inducible or constitutive, environmentally- or developmentally-regulated, or location-specific expression), a transcription initiation start site, a ribosome binding site, a transcription termination site, and/or a polyadenylation signal.

The term “transformed cell” refers to a cell that has undergone a transformation. Thus, a transformed cell comprises the parent's genome and an inheritable genetic modification.

The terms “triacylglyceride,” “triacylglycerol,” “triglyceride,” and “TAG” refer to esters composed of glycerol and three fatty acids.

The term “recombinant gene” refers to a gene that (1) is operatively linked to a polynucleotide to which it is not linked in nature or (2) has a nucleotide sequence different from the naturally-occurring nucleotide sequence, such as, for example, a non-naturally occurring mutation, a codon-optimized sequence, or a cDNA that lacks naturally-occurring introns that are found at the gene's genomic locus. The term “recombinant” can be used in reference to cloned DNA isolates, chemically synthesized polynucleotide analogs, or polynucleotide analogs that are biologically synthesized by heterologous systems, as well as proteins and/or mRNAs encoded by such nucleic acids. Thus, for example, a protein synthesized by a microorganism is recombinant, if it is synthesized from an mRNA that is synthesized from a recombinant gene present in the cell. As other examples, a gene may be a recombinant gene if it is operably linked to a promoter different from the promoter to which it is operably linked in nature or if it is connected to another gene or portion thereof and, together with the other gene or portion thereof, encodes a protein that is not found in nature, such as a fusion protein or an epitope-tagged protein.

B. Microbe Engineering

1. Overview

Genes and gene products may be introduced into microbial host cells. Suitable host cells for expression of the genes and nucleic acid molecules are microbial hosts that can be found broadly within the fungal or bacterial families. Examples of suitable host strains include but are not limited to fungal or yeast species, such as Arxula, Aspergillus, Aurantiochytrium, Candida, Claviceps, Cryptococcus, Cunninghamella, Hansenula, Kluyveromyces, Leucosporidiella, Lipomyces, Mortierella, Ogataea, Pichia, Prototheca, Rhizopus, Rhodosporidium, Rhodotorula, Saccharomyces, Schizosaccharomyces, Tremella, Trichosporon, Yarrowia, or bacterial species, such as members of proteobacteria and actinomycetes, as well as the genera Acinetobacter, Arthrobacter, Brevibacterium, Acidovorax, Bacillus, Clostridium, Streptomyces, Escherichia, Salmonella, Pseudomonas, and Corynebacterium. Yarrowia lipolytica and Arxula adeninivorans are suited for use as a host microorganism because they can accumulate a large percentage of their weight as triacylglycerols.

Microbial expression systems and expression vectors containing regulatory sequences that direct high level expression of foreign proteins are known to those skilled in the art. Any of these could be used to construct chimeric genes to produce any one of the gene products of the instant sequences. These chimeric genes could then be introduced into appropriate microorganisms via transformation techniques to provide high-level expression of the enzymes.

For example, a gene encoding an enzyme can be cloned in a suitable plasmid, and an aforementioned starting parent strain as a host can be transformed with the resulting plasmid. This approach can increase the copy number of each of the genes encoding the enzymes and, as a result, the activities of the enzymes can be increased. The plasmid is not particularly limited so long as it renders a desired genetic modification inheritable to the microorganism's progeny.

Vectors or cassettes useful for the transformation of suitable host cells are well known. Typically the vector or cassette contains sequences that direct the transcription and translation of the relevant gene, a selectable marker, and sequences that allow autonomous replication or chromosomal integration. Suitable vectors comprise a region 5′ of the gene harboring transcriptional initiation controls and a region 3′ of the DNA fragment which controls transcriptional termination. In certain embodiments both control regions are derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a production host.

Promoters, cDNAs, and 3′ UTRs, as well as other elements of the vectors, can be generated through cloning techniques using fragments isolated from native sources (see, e.g., Green & Sambrook, Molecular Cloning: A Laboratory Manual, (4th ed., 2012); U.S. Pat. No. 4,683,202 (incorporated by reference)). Alternatively, elements can be generated synthetically using known methods (see, e.g., Gene 164:49-53 (1995)).

2. Homologous Recombination

Homologous recombination is the ability of complementary DNA sequences to align and exchange regions of homology. Transgenic DNA (“donor”) containing sequences homologous to the genomic sequences being targeted (“template”) is introduced into the organism and then undergoes recombination into the genome at the site of the corresponding homologous genomic sequences.

The ability to carry out homologous recombination in a host organism has many practical implications for what can be carried out at the molecular genetic level and is useful in the generation of a microbe that can produce a desired product. By its nature homologous recombination is a precise gene targeting event and, hence, most transgenic lines generated with the same targeting sequence will be essentially identical in terms of phenotype, necessitating the screening of far fewer transformation events. Homologous recombination also targets gene insertion events into the host chromosome, potentially resulting in excellent genetic stability, even in the absence of genetic selection. Because different chromosomal loci will likely impact gene expression, even from exogenous promoters/UTRs, homologous recombination can be a method of querying loci in an unfamiliar genome environment and of assessing the impact of these environments on gene expression.

A particularly useful genetic engineering approach using homologous recombination is to co-opt specific host regulatory elements, such as promoters/UTRs, to drive heterologous gene expression in a highly specific fashion.

Because homologous recombination is a precise gene targeting event, it can be used to precisely modify any nucleotide(s) within a gene or region of interest, so long as sufficient flanking regions have been identified. Therefore, homologous recombination can be used as a means to modify regulatory sequences impacting gene expression of RNA and/or proteins. It can also be used to modify protein coding regions in an effort to modify enzyme activities such as substrate specificity, affinities and Km, thereby effecting a desired change in the metabolism of the host cell. Homologous recombination provides a powerful means to manipulate the host genome resulting in gene targeting, gene conversion, gene deletion, gene duplication, gene inversion, and exchanging gene expression regulatory elements such as promoters, enhancers and 3′ UTRs.

Homologous recombination can be achieved by using targeting constructs containing pieces of endogenous sequences to “target” the gene or region of interest within the endogenous host cell genome. Such targeting sequences can either be located 5′ of the gene or region of interest, 3′ of the gene/region of interest or even flank the gene/region of interest. Such targeting constructs can be transformed into the host cell either as a supercoiled plasmid DNA with additional vector backbone, a PCR product with no vector backbone, or as a linearized molecule. In some cases, it may be advantageous to first expose the homologous sequences within the transgenic DNA (donor DNA) by cutting the transgenic DNA with a restriction enzyme. This step can increase the recombination efficiency and decrease the occurrence of undesired events. Other methods of increasing recombination efficiency include using PCR to generate transforming transgenic DNA containing linear ends homologous to the genomic sequences being targeted.

3. Vectors and Vector Components

Vectors for transforming microorganisms in accordance with the present invention can be prepared by known techniques familiar to those skilled in the art in view of the disclosure herein. A vector typically contains one or more genes, in which each gene codes for the expression of a desired product (the gene product) and is operably linked to one or more control sequences that regulate gene expression or target the gene product to a particular location in the recombinant cell.

a. Control Sequences

Control sequences are nucleic acids that regulate the expression of a coding sequence or direct a gene product to a particular location in or outside a cell. Control sequences that regulate expression include, for example, promoters that regulate transcription of a coding sequence and terminators that terminate transcription of a coding sequence. Another control sequence is a 3′ untranslated sequence located at the end of a coding sequence that encodes a polyadenylation signal. Control sequences that direct gene products to particular locations include those that encode signal peptides, which direct the protein to which they are attached to a particular location inside or outside the cell.

Thus, an exemplary vector design for expression of a gene in a microbe contains a coding sequence for a desired gene product (for example, a selectable marker, or an enzyme) in operable linkage with a promoter active in yeast. Alternatively, if the vector does not contain a promoter in operable linkage with the coding sequence of interest, the coding sequence can be transformed into the cells such that it becomes operably linked to an endogenous promoter at the point of vector integration. The promoter used to express a gene can be the promoter naturally linked to that gene or a different promoter.

A promoter can generally be characterized as constitutive or inducible. Constitutive promoters are generally active or function to drive expression at all times (or at certain times in the cell life cycle) at the same level. Inducible promoters, conversely, are active (or rendered inactive) or are significantly up- or down-regulated only in response to a stimulus. Both types of promoters find application in the methods of the invention. Inducible promoters useful in the invention include those that mediate transcription of an operably linked gene in response to a stimulus, such as an exogenously provided small molecule, temperature (heat or cold), lack of nitrogen in culture media, etc. Suitable promoters can activate transcription of an essentially silent gene or upregulate, e.g., substantially, transcription of an operably linked gene that is transcribed at a low level.

Inclusion of a termination region control sequence is optional, and if employed, then the choice is primarily one of convenience, as the termination region is relatively interchangeable. The termination region may be native to the transcriptional initiation region (the promoter), may be native to the DNA sequence of interest, or may be obtainable from another source (see, e.g., Chen & Orozco, Nucleic Acids Research 16:8411 (1988)).

b. Genes and Codon Optimization

Typically, a gene includes a promoter, a coding sequence, and termination control sequences. When assembled by recombinant DNA technology, a gene may be termed an expression cassette and may be flanked by restriction sites for convenient insertion into a vector that is used to introduce the recombinant gene into a host cell. The expression cassette can be flanked by DNA sequences from the genome or other nucleic acid target to facilitate stable integration of the expression cassette into the genome by homologous recombination. Alternatively, the vector and its expression cassette may remain unintegrated (e.g., an episome), in which case, the vector typically includes an origin of replication, which is capable of providing for replication of the vector DNA.
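The cassette layout described above (promoter, coding sequence, and termination sequence, optionally flanked by genomic sequences for integration by homologous recombination) can be sketched in Python as simple string assembly. All class names, function names, and sequences below are hypothetical placeholders; real elements would be far longer and chosen for the host in question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionCassette:
    """Promoter, coding sequence, and terminator, stored 5' to 3'."""
    promoter: str
    coding_sequence: str
    terminator: str

    def sequence(self) -> str:
        # Concatenate the three elements in transcriptional order.
        return self.promoter + self.coding_sequence + self.terminator


def build_integration_donor(cassette: ExpressionCassette,
                            upstream_arm: str,
                            downstream_arm: str) -> str:
    """Flank the cassette with genomic homology arms for targeted integration."""
    return upstream_arm + cassette.sequence() + downstream_arm


# Placeholder sequences only; none are taken from this disclosure.
cassette = ExpressionCassette(promoter="TATAAA",
                              coding_sequence="ATGGCTTAA",
                              terminator="AATAAA")
donor = build_integration_donor(cassette, upstream_arm="GGGCCC", downstream_arm="CCCGGG")
print(donor)  # GGGCCCTATAAAATGGCTTAAAATAAACCCGGG
```

For an episomal vector, the homology arms would be omitted and an origin of replication supplied on the vector backbone instead.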

A common gene present on a vector is a gene that codes for a protein, the expression of which allows the recombinant cell containing the protein to be differentiated from cells that do not express the protein. Such a gene, and its corresponding gene product, is called a selectable marker or selection marker. Any of a wide variety of selectable markers can be employed in a transgene construct useful for transforming the organisms of the invention.

For optimal expression of a recombinant protein, it is beneficial to employ coding sequences that produce mRNA with codons optimally used by the host cell to be transformed. Thus, proper expression of transgenes can require that the codon usage of the transgene matches the specific codon bias of the organism in which the transgene is being expressed. The precise mechanisms underlying this effect are many, but include the proper balancing of available aminoacylated tRNA pools with proteins being synthesized in the cell, coupled with more efficient translation of the transgenic messenger RNA (mRNA) when this need is met. When codon usage in the transgene is not optimized, available tRNA pools are not sufficient to allow for efficient translation of the transgenic mRNA resulting in ribosomal stalling and termination and possible instability of the transgenic mRNA. Resources for codon-optimization of gene sequences are described in Puigbo et al., Nucleic Acids Research 35:W126-31 (2007), and principles underlying codon optimization strategies are described in Angov, Biotechnology Journal 6:650-69 (2011). Public databases providing statistics for codon usage by different organisms are available, including at www.kazusa.or.jp/codon/ and other publicly available databases and resources.
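One simple way to match a transgene to a host's codon bias is to count codon usage in reference coding sequences from the host and then encode each amino acid with its most frequently observed codon. The Python sketch below illustrates that idea under stated assumptions: it uses the standard genetic code, the "most frequent codon" rule is only one of several possible strategies and is not asserted to be the method used in this disclosure, and the reference sequence is an invented placeholder rather than real host data.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def preferred_codons(reference_cds_list):
    """For each amino acid, return the codon seen most often in the references."""
    counts = defaultdict(Counter)
    for cds in reference_cds_list:
        cds = cds.upper().replace("U", "T")
        # Walk complete codons only; a trailing partial codon is ignored.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            counts[CODON_TABLE[codon]][codon] += 1
    return {aa: c.most_common(1)[0][0] for aa, c in counts.items()}


def back_translate(protein, preferred):
    """Encode a protein (one-letter code, '*' = stop) with one codon per residue."""
    return "".join(preferred[aa] for aa in protein)


# Invented reference coding sequence standing in for highly expressed host genes.
reference = ["ATGGCTGCTGCCAAGTAA"]
preferred = preferred_codons(reference)
print(back_translate("MAK*", preferred))  # ATGGCTAAGTAA
```

In practice the reference set would come from a codon usage database for the host, and an amino acid absent from the reference set would raise a `KeyError` in this sketch.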

4. Transformation

Cells can be transformed by any suitable technique including, e.g., biolistics, electroporation, glass bead transformation, and silicon carbide whisker transformation. Any convenient technique for introducing a transgene into a microorganism can be employed in the present invention. Transformation can be achieved by, for example, the method of D. M. Morrison (Methods in Enzymology 68:326 (1979)), the method of increasing the permeability of recipient cells to DNA with calcium chloride (Mandel & Higa, J. Molecular Biology, 53:159 (1970)), or the like.

Examples of expression of transgenes in oleaginous yeast (e.g., Yarrowia lipolytica) can be found in the literature (Bordes et al., J. Microbiological Methods, 70:493 (2007); Chen et al., Applied Microbiology & Biotechnology 48:232 (1997)). Examples of expression of exogenous genes in bacteria such as E. coli are well known (Green & Sambrook, Molecular Cloning: A Laboratory Manual, (4th ed., 2012)).

Vectors for transformation of microorganisms in accordance with the present invention can be prepared by known techniques familiar to those skilled in the art. In one embodiment, an exemplary vector design for expression of a gene in a microorganism contains a gene encoding an enzyme in operable linkage with a promoter active in the microorganism. Alternatively, if the vector does not contain a promoter in operable linkage with the gene of interest, the gene can be transformed into the cells such that it becomes operably linked to a native promoter at the point of vector integration. The vector can also contain a second gene that encodes a protein. Optionally, one or both gene(s) is/are followed by a 3′ untranslated sequence containing a polyadenylation signal. Expression cassettes encoding the two genes can be physically linked in the vector or on separate vectors. Co-transformation of microbes can also be used, in which distinct vector molecules are simultaneously used to transform cells (Protist 155:381-93 (2004)). The transformed cells can be optionally selected based upon the ability to grow in the presence of the antibiotic or other selectable marker under conditions in which cells lacking the resistance cassette would not grow.

C. Exemplary Cells, Nucleic Acids, Compositions, and Methods

1. Transformed Cells

In some aspects, embodiments of the invention include cells transformed with one or more nucleic acids encoding a methyltransferase and/or reductase protein. In some embodiments, the transformed cell is a prokaryotic cell, such as a bacterial cell. In some embodiments, the cell is a eukaryotic cell, such as a mammalian cell, a yeast cell, a filamentous fungi cell, a protist cell, an algae cell, an avian cell, a plant cell, or an insect cell. In some embodiments, the cell is a yeast. Those with skill in the art will recognize that many forms of filamentous fungi produce yeast-like growth, and the definition of yeast herein encompasses such cells. The cell may be selected from the group consisting of algae, bacteria, molds, fungi, plants, and yeasts. The cell may be a yeast, fungus, or yeast-like algae. The cell may be selected from thraustochytrids (Aurantiochytrium) and achlorophylic unicellular algae (Prototheca).

The cell may be selected from the group consisting of Arxula, Aspergillus, Aurantiochytrium, Candida, Claviceps, Cryptococcus, Cunninghamella, Geotrichum, Hansenula, Kluyveromyces, Kodamaea, Leucosporidiella, Lipomyces, Mortierella, Ogataea, Pichia, Prototheca, Rhizopus, Rhodosporidium, Rhodotorula, Saccharomyces, Schizosaccharomyces, Tremella, Trichosporon, Wickerhamomyces, and Yarrowia. It is specifically contemplated that one or more of these cell types may be excluded from embodiments of this invention.

The cell may be selected from the group consisting of Arxula adeninivorans, Aspergillus niger, Aspergillus oryzae, Aspergillus terreus, Aurantiochytrium limacinum, Candida utilis, Claviceps purpurea, Cryptococcus albidus, Cryptococcus curvatus, Cryptococcus ramirezgomezianus, Cryptococcus terreus, Cryptococcus wieringae, Cunninghamella echinulata, Cunninghamella japonica, Geotrichum fermentans, Hansenula polymorpha, Kluyveromyces lactis, Kluyveromyces marxianus, Kodamaea ohmeri, Leucosporidiella creatinivora, Lipomyces lipofer, Lipomyces starkeyi, Lipomyces tetrasporus, Mortierella isabellina, Mortierella alpina, Ogataea polymorpha, Pichia ciferrii, Pichia guilliermondii, Pichia pastoris, Pichia stipitis, Prototheca zopfii, Rhizopus arrhizus, Rhodosporidium babjevae, Rhodosporidium toruloides, Rhodosporidium paludigenum, Rhodotorula glutinis, Rhodotorula mucilaginosa, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Tremella encephala, Trichosporon cutaneum, Trichosporon fermentans, Wickerhamomyces ciferrii, and Yarrowia lipolytica. It is specifically contemplated that one or more of these cell types may be excluded from embodiments of this invention.

In certain embodiments, the transformed cell comprises about, at least about, or at most about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, or more lipid as measured by % dry cell weight, or any range derivable therein. In some embodiments, the transformed cell comprises C18 fatty acids at a concentration of about, at least about, or at most about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, or 95% as a percentage of total C16 and C18 fatty acids in the cell by weight, or any range derivable therein.

In some embodiments, the transformed cell comprises oleic acid at a concentration of about, at least about, or at most about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90% or higher as a percentage of total C16 and C18 fatty acids in the cell by weight, or any range derivable therein. In some embodiments, the transformed cell comprises 10-methylstearic acid at a concentration of about, at least about, or of at most about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, or higher as a percentage of total fatty acids in the cell by weight, or any range derivable therein.

A cell may be modified to increase its oleate content, which serves as a substrate for 10-methylstearate synthesis. Genetic modifications that increase oleate content are known (see, e.g., PCT Patent Application Publication No. WO16/094520, published Jun. 16, 2016, hereby incorporated by reference in its entirety). For example, a cell may comprise a Δ12 desaturase knockdown or knockout, which favors the accumulation of oleate and disfavors the production of linoleate. A cell may comprise a recombinant Δ9 desaturase gene, which favors the production of oleate and disfavors the accumulation of stearate. The recombinant Δ9 desaturase gene may be, for example, the Δ9 desaturase gene from Y. lipolytica, Arxula adeninivorans, or Puccinia graminis. A cell may comprise a recombinant elongase 1 gene, which favors the production of oleate and disfavors the accumulation of palmitate and palmitoleate. The recombinant elongase 1 gene may be the elongase 1 gene from Y. lipolytica. A cell may comprise a recombinant elongase 2 gene, which favors the production of oleate and disfavors the accumulation of palmitate and palmitoleate. The recombinant elongase 2 gene may be the elongase 2 gene from R. norvegicus.

A cell may be modified to increase its triacylglycerol content, thereby increasing its 10-methylstearate content. Genetic modifications that increase triacylglycerol content are known (see, e.g., PCT Patent Application Publication No. WO16/094520, published Jun. 16, 2016, hereby incorporated by reference in its entirety). A cell may comprise a recombinant diacylglycerol acyltransferase gene (e.g., DGAT1, DGAT2, or DGAT3), which favors the production of triacylglycerols and disfavors the accumulation of diacylglycerols. The recombinant diacylglycerol acyltransferase gene may be, for example, DGAT2 (encoding protein DGA1) from Y. lipolytica, DGAT1 (encoding protein DGA2) from C. purpurea, or DGAT2 (encoding protein DGA1) from R. toruloides. The cell may comprise a glycerol-3-phosphate acyltransferase gene (Sct1) knockdown or knockout, which may favor the accumulation of triacylglycerols, depending on the cell type. The cell may comprise a recombinant glycerol-3-phosphate acyltransferase gene (Sct1) such as the Sct1 gene from A. adeninivorans, which may favor the accumulation of triacylglycerols. The cell may comprise a triacylglycerol lipase gene (TGL) knockdown or knockout, which may favor the accumulation of triacylglycerols in the cell.

Various aspects of the invention relate to a transformed cell. The transformed cell may comprise a recombinant methyltransferase gene (e.g., a tmpB gene), a recombinant reductase gene (e.g., a tmpA gene), an exomethylene-substituted lipid, and/or a branched (methyl)lipid. A branched (methyl)lipid may be a carboxylic acid (e.g., 10-methylstearic acid, 10-methylpalmitic acid, 12-methyloleic acid, 13-methyloleic acid, 10-methyl-octadec-12-enoic acid), carboxylate (e.g., 10-methylstearate, 10-methylpalmitate, 12-methyloleate, 13-methyloleate, 10-methyl-octadec-12-enoate), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylstearyl CoA, 10-methylpalmityl CoA, 12-methyloleoyl CoA, 13-methyloleoyl CoA, 10-methyl-octadec-12-enoyl CoA), or amide. An exomethylene-substituted lipid may be a carboxylic acid (e.g., 10-methylenestearic acid, 10-methylenepalmitic acid, 12-methyleneoleic acid, 13-methyleneoleic acid, 10-methylene-octadec-12-enoic acid), carboxylate (e.g., 10-methylenestearate, 10-methylenepalmitate, 12-methyleneoleate, 13-methyleneoleate, 10-methylene-octadec-12-enoate), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylenestearyl CoA, 10-methylenepalmityl CoA, 12-methyleneoleoyl CoA, 13-methyleneoleoyl CoA, 10-methylene-octadec-12-enoyl CoA), or amide. It is specifically contemplated that one or more of the above lipids may be excluded from embodiments of this invention. The methyltransferase gene and reductase gene may have the capability of together producing a methylated branch from any fatty acid from 14 to 18 carbons long with an unsaturated double bond in the Δ9, Δ10, or Δ11 position. The fatty acid may be 14, 15, 16, 17, or 18 carbons, or any range derivable therein.
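The substrate range stated above (fatty acids of 14 to 18 carbons with a double bond at the Δ9, Δ10, or Δ11 position) can be expressed as a simple predicate. The Python sketch below encodes only those stated bounds; the function name and example fatty acids are illustrative assumptions, and the check says nothing about the activity of any particular enzyme on a given substrate.

```python
def is_candidate_substrate(carbons: int, double_bond_positions: set) -> bool:
    """Return True if chain length is 14-18 and a double bond lies at position 9, 10, or 11."""
    has_target_bond = any(p in (9, 10, 11) for p in double_bond_positions)
    return 14 <= carbons <= 18 and has_target_bond


# Illustrative fatty acids described by chain length and double-bond positions.
print(is_candidate_substrate(18, {9}))   # oleic acid (18:1 delta-9): True
print(is_candidate_substrate(18, set())) # stearic acid (18:0): False
print(is_candidate_substrate(22, {13}))  # erucic acid (22:1 delta-13): False
```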

“Fatty acids” generally exist in a cell as a phospholipid or triacylglycerol, although they may also exist as a monoacylglycerol or diacylglycerol, for example, as a metabolic intermediate. Free fatty acids also exist in the cell in equilibrium between a relatively abundant carboxylate anion and a relatively scarce, neutrally-charged acid. A fatty acid may exist in a cell as a thioester, especially as a thioester with coenzyme A (CoA), during biosynthesis or oxidation. A fatty acid may exist in a cell as an amide, for example, when covalently bound to a protein to anchor the protein to a membrane.

A cell may comprise any one of the nucleic acids described herein, infra (see, e.g., Section B, below). A cell may comprise multiple copies of any one of the nucleic acids described herein. This can be accomplished by, for example, including a tmpB and/or tmpA gene on a high-copy-number plasmid that is transformed into a cell.

A branched (methyl)lipid may comprise a saturated branched aliphatic chain (e.g., 10-methylstearic acid, 10-methylpalmitic acid) or an unsaturated branched aliphatic chain (e.g., 12-methyloleic acid, 13-methyloleic acid, 10-methyl-octadec-12-enoic acid). The branched (methyl)lipid may comprise a saturated or unsaturated branched aliphatic chain comprising a branching methyl group.

An exomethylene-substituted lipid may comprise a branched aliphatic chain (e.g., 10-methylenestearic acid, 10-methylenepalmitic acid, 12-methyleneoleic acid, 13-methyleneoleic acid, 10-methylene-octadec-12-enoic acid). The aliphatic chain may be branched because the aliphatic chain is substituted with an exomethylene group.

A branched (methyl)lipid may be 10-methylstearate, or an acid (10-methylstearic acid), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylstearyl CoA), or amide (e.g., 10-methylstearyl amide) thereof. For example, the branched (methyl)lipid may be a diacylglycerol, triacylglycerol, or phospholipid, and the diacylglycerol, triacylglycerol, or phospholipid may comprise an ester of 10-methylstearate.

An exomethylene-substituted lipid may be 10-methylenestearate, or an acid (10-methylenestearic acid), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylenestearyl CoA), or amide (e.g., 10-methylenestearyl amide) thereof. For example, the exomethylene-substituted lipid may be a diacylglycerol, triacylglycerol, or phospholipid, and the diacylglycerol, triacylglycerol, or phospholipid may comprise an ester of 10-methylenestearate.

In some embodiments, about, at least about, or at most about 1% of the fatty acids of the cell may be 10-methylstearic acid by weight. About, at least about, or at most about 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% by weight of the fatty acids of the cell may be 10-methylstearic acid, or any range derivable therein.

In some embodiments, about, at least about, or at most about 1% of the fatty acids of the cell may be 10-methylenestearic acid by weight. About, at least about, or at most about 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% by weight of the fatty acids of the cell may be 10-methylenestearic acid, or any range derivable therein.

In some embodiments, about, at least about, or at most about 1% by weight of the fatty acids of the cell may be one or more of the branched (methyl)lipids described herein. About, at least about, or at most about 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% by weight of the fatty acids of the cell may be one or more of the branched (methyl)lipids described herein, or any range derivable therein.

In some embodiments, the cell may comprise about, at least about, or at most about 1% 10-methylstearic acid as measured by % dry cell weight. The cell may comprise about, at least about, or at most about 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, or 50% 10-methylstearic acid as measured by % dry cell weight, or any range derivable therein.

In some embodiments, the cell may comprise about, at least about, or at most about 1% 10-methylenestearic acid as measured by % dry cell weight. The cell may comprise about, at least about, or at most about 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, or 50% 10-methylenestearic acid as measured by % dry cell weight, or any range derivable therein.
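The two measures used above, weight percent of total fatty acids and weight percent of dry cell weight, are simple mass ratios. The following is a minimal sketch of how they might be computed from a fatty acid profile; the function names and the example masses are hypothetical and are not taken from the specification.

```python
def percent_of_fatty_acids(masses_mg, target):
    """Weight percent of `target` among all fatty acids in the profile."""
    total = sum(masses_mg.values())
    if total <= 0:
        raise ValueError("profile must contain a positive total mass")
    return 100.0 * masses_mg.get(target, 0.0) / total


def percent_of_dry_cell_weight(masses_mg, target, dry_cell_weight_mg):
    """Weight percent of `target` relative to the dry cell weight."""
    if dry_cell_weight_mg <= 0:
        raise ValueError("dry cell weight must be positive")
    return 100.0 * masses_mg.get(target, 0.0) / dry_cell_weight_mg


# Hypothetical profile (mg of each fatty acid), for illustration only.
profile = {
    "oleic acid": 30.0,
    "palmitic acid": 10.0,
    "10-methylstearic acid": 10.0,
}

print(percent_of_fatty_acids(profile, "10-methylstearic acid"))              # 20.0
print(percent_of_dry_cell_weight(profile, "10-methylstearic acid", 200.0))   # 5.0
```

The same two functions apply unchanged to 10-methylenestearic acid or to any other lipid named in the profile.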

An unmodified cell of the same type (e.g., species) as a cell of the invention may not comprise 10-methylstearate, or an acid (10-methylstearic acid), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylstearyl CoA), or amide (e.g., 10-methylstearyl amide) thereof (e.g., wherein the unmodified cell does not comprise a recombinant methyltransferase gene or a recombinant reductase gene). An unmodified cell of the same type (e.g., species) as a cell of the invention may not comprise 10-methylenestearate, or an acid (10-methylenestearic acid), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylenestearyl CoA), or amide (e.g., 10-methylenestearyl amide) thereof (e.g., wherein the unmodified cell does not comprise a recombinant methyltransferase gene or a recombinant reductase gene). In some embodiments, an unmodified cell of the same species as the cell does not comprise a branched (methyl)lipid and/or an exomethylene-substituted lipid. In some embodiments, an unmodified cell of the same species as the cell does not comprise one or more of the branched (methyl)lipids or exomethylene-substituted lipids described herein.

In some embodiments, a cell may constitutively express the protein encoded by a recombinant methyltransferase gene and/or reductase gene. A cell may constitutively express a methyltransferase protein and/or reductase protein.

2. Nucleic Acids

a. General

Various aspects of the invention relate to a nucleic acid comprising a recombinant methyltransferase gene, a recombinant reductase gene, or both. The nucleic acid may be, for example, a plasmid. In some embodiments, a recombinant methyltransferase gene and/or a recombinant reductase gene is integrated into the genome of a cell, and thus, the nucleic acid may be a chromosome. In some embodiments, the invention relates to a cell comprising a recombinant methyltransferase gene, e.g., wherein the recombinant methyltransferase gene is present in a plasmid or chromosome. In some embodiments, the invention relates to a cell comprising a recombinant reductase gene, e.g., wherein the recombinant reductase gene is present in a plasmid or chromosome. A recombinant methyltransferase gene and a recombinant reductase gene may be present in a cell in the same nucleic acid (e.g., same plasmid or chromosome) or in different nucleic acids (e.g., different plasmids or chromosomes).

A nucleic acid may be inheritable to the progeny of a transformed cell. A gene such as a recombinant methyltransferase gene or recombinant reductase gene may be inheritable because it resides on a plasmid or chromosome. In certain embodiments, a gene may be inheritable because it is integrated into the genome of the transformed cell.

A gene may comprise conservative substitutions, deletions, and/or insertions while still encoding a protein that has activity. For example, codons may be optimized for a particular host cell, different codons may be substituted for convenience, such as to introduce a restriction site or to create optimal PCR primers, or codons may be substituted for another purpose. Similarly, the nucleotide sequence may be altered to create conservative amino acid substitutions, deletions, and/or insertions.

Proteins may comprise conservative substitutions, deletions, and/or insertions while still maintaining activity. Conservative substitution tables are well known in the art (Creighton, Proteins (2d. ed., 1992)).

Amino acid substitutions, deletions and/or insertions may readily be made using recombinant DNA manipulation techniques. Methods for the manipulation of DNA sequences to produce substitution, insertion or deletion variants of a protein are well known in the art. These methods include M13 mutagenesis, T7-Gen in vitro mutagenesis (USB, Cleveland, Ohio), Quick Change Site Directed mutagenesis (Stratagene, San Diego, Calif.), PCR-mediated site-directed mutagenesis, and other site-directed mutagenesis protocols.

A “coding sequence” or “coding region” refers to a nucleic acid molecule having sequence information necessary to produce a protein product, such as an amino acid or polypeptide, when the sequence is expressed. The coding sequence may comprise and/or consist of untranslated sequences (including introns or 5′ or 3′ untranslated regions) within translated regions, or may lack such intervening untranslated sequences (e.g., as in cDNA).

The abbreviations used throughout the specification to refer to nucleic acids comprising and/or consisting of nucleotide sequences are the conventional one-letter abbreviations. Thus, when included in a nucleic acid, the naturally occurring encoding nucleotides are abbreviated as follows: adenine (A), guanine (G), cytosine (C), thymine (T) and uracil (U). Also, unless otherwise specified, the nucleic acid sequences presented herein are written in the 5′→3′ direction.
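As a minimal sketch of these conventions (the helper names and the example sequence are hypothetical), a DNA sequence written 5′→3′ can be checked against the one-letter alphabet, and its complementary strand can be reported in the same 5′→3′ orientation:

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def validate_dna(seq):
    """Return the upper-cased sequence, or raise if a letter is not A, C, G, or T."""
    seq = seq.upper()
    bad = set(seq) - set(DNA_COMPLEMENT)
    if bad:
        raise ValueError(f"unexpected letters: {sorted(bad)}")
    return seq


def reverse_complement(seq):
    """Complementary strand, also written in the 5'->3' direction."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(validate_dna(seq)))


def transcribe(seq):
    """RNA copy of a coding strand: thymine (T) is replaced by uracil (U)."""
    return validate_dna(seq).replace("T", "U")


print(reverse_complement("ATGGCC"))  # GGCCAT
print(transcribe("ATGGCC"))          # AUGGCC
```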

b. Nucleic Acids Comprising a Recombinant Methyltransferase Gene

A methyltransferase gene (e.g., a recombinant methyltransferase gene) encodes a methyltransferase protein, which is an enzyme capable of transferring a carbon atom and one or more protons bound thereto from a substrate such as S-adenosyl methionine to a fatty acid such as oleic acid (e.g., wherein the fatty acid is present as a free fatty acid, carboxylate, phospholipid, diacylglycerol, or triacylglycerol). The methyltransferase gene (e.g., a recombinant methyltransferase gene) may have a coding region that is identical to one from a bacterium of the class Gammaproteobacteria. The methyltransferase gene may comprise any one of the nucleotide sequences set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, and SEQ ID NO:17. The methyltransferase gene (e.g., a recombinant methyltransferase gene) may be a 10-methylstearic B gene (tmpB) as described herein, or a biologically-active portion thereof (i.e., wherein the biologically-active portion thereof comprises methyltransferase activity).

The methyltransferase gene (e.g., a recombinant methyltransferase gene) may be derived from a species of Gammaproteobacteria, such as bacteria from the genera Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, or Halofilum. The methyltransferase gene (e.g., a recombinant methyltransferase gene) may be selected from the group consisting of Desulfobacula balticum gene tmpB (SEQ ID NO:1), Marinobacter hydrocarbonclasticus gene tmpB (SEQ ID NO:3), Thiohalospira halophila gene tmpB (SEQ ID NO:5), Desulfobacter curvatus gene tmpB (SEQ ID NO:7), Desulfobacter phenolica gene tmpB (SEQ ID NO:9), Desulfobacula toluolica gene tmpB (SEQ ID NO:11), Desulfobacter postgatei gene tmpB (SEQ ID NO:13), Halofilum ochraceum gene tmpB (SEQ ID NO:15), and Marinobacter aquaeolei gene tmpB (SEQ ID NO:17). It is specifically contemplated that one or more of the above methyltransferase genes may be excluded from embodiments of this invention.

A recombinant methyltransferase gene may be recombinant because it is operably linked to a promoter other than the naturally-occurring promoter of the methyltransferase gene. Such genes may be useful to drive transcription in a particular species of cell. A recombinant methyltransferase gene may be recombinant because it contains one or more nucleotide substitutions relative to a naturally-occurring methyltransferase gene. Such genes may be useful to increase the translation efficiency of the methyltransferase gene's mRNA transcript in a particular species of cell.

A nucleic acid may comprise a recombinant methyltransferase gene and a promoter, wherein the recombinant methyltransferase gene and promoter are operably linked. The recombinant methyltransferase gene and promoter may be derived from different species. For example, the recombinant methyltransferase gene may encode the methyltransferase protein of a species of Gammaproteobacteria, and the recombinant methyltransferase gene may be operably-linked to a promoter that can drive transcription in another type of bacteria or a eukaryote (e.g., an algae cell, yeast cell, or plant cell). The promoter may be a eukaryotic promoter. A cell may comprise the nucleic acid, and the promoter may be capable of driving transcription in the cell. A cell may comprise a recombinant methyltransferase gene, and the recombinant methyltransferase gene may be operably linked to a promoter capable of driving transcription of the recombinant methyltransferase gene in the cell. The cell may be a species of yeast, and the promoter may be a yeast promoter. The cell may be a species of bacteria, and the promoter may be a bacterial promoter (e.g., wherein the bacterial promoter is not a promoter from a Gammaproteobacterium). The cell may be a species of algae, and the promoter may be an algae promoter. The cell may be a species of plant, and the promoter may be a plant promoter.

A recombinant methyltransferase gene may be operably linked to a promoter that cannot drive transcription in the cell from which the recombinant methyltransferase gene originated. For example, the promoter may not be capable of binding an RNA polymerase of the cell from which a recombinant methyltransferase gene originated. In some embodiments, the promoter cannot bind a prokaryotic RNA polymerase and/or initiate transcription mediated by a prokaryotic RNA polymerase. In some embodiments, a recombinant methyltransferase gene is operably-linked to a promoter that cannot drive transcription in the cell from which the protein encoded by the gene originated. For example, the promoter may not be capable of binding an RNA polymerase of a cell that naturally expresses the methyltransferase enzyme encoded by a recombinant methyltransferase gene.

A promoter may be an inducible promoter or a constitutive promoter. A promoter may be any one of the promoters described in PCT Patent Application Publication No. WO 2016/014900, published Jan. 28, 2016 (hereby incorporated by reference in its entirety). WO 2016/014900 describes various promoters derived from yeast species Yarrowia lipolytica and Arxula adeninivorans, which may be particularly useful as promoters for driving the transcription of a recombinant gene in a yeast cell. A promoter may be a promoter from a gene encoding a Translation Elongation factor EF-1α; Glycerol-3-phosphate dehydrogenase; Triosephosphate isomerase 1; Fructose-1,6-bisphosphate aldolase; Phosphoglycerate mutase; Pyruvate kinase; Export protein EXP1; Ribosomal protein S7; Alcohol dehydrogenase; Phosphoglycerate kinase; Hexose Transporter; General amino acid permease; Serine protease; Isocitrate lyase; Acyl-CoA oxidase; ATP-sulfurylase; Hexokinase; 3-phosphoglycerate dehydrogenase; Pyruvate Dehydrogenase Alpha subunit; Pyruvate Dehydrogenase Beta subunit; Aconitase; Enolase; Actin; Multidrug resistance protein (ABC-transporter); Ubiquitin; GTPase; Plasma membrane Na+/Pi cotransporter; Pyruvate decarboxylase; Phytase; or Alpha-amylase, e.g., wherein the gene is a yeast gene, such as a gene from Yarrowia lipolytica or Arxula adeninivorans.

A recombinant methyltransferase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17. A recombinant methyltransferase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, or 1300 contiguous base pairs of the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17. A recombinant methyltransferase gene may or may not have 100% sequence identity with any one of the nucleotide sequences set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17.
A recombinant methyltransferase gene may or may not have 100% sequence identity with 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, or 1300 contiguous base pairs starting at any one of nucleotide positions 1 through 1200, inclusive, of the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17. A recombinant methyltransferase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17, and the recombinant methyltransferase gene may encode a methyltransferase protein with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18. For example, a gene that is codon-optimized for expression in yeast may have about 70% sequence identity with SEQ ID NO:1, while the protein encoded by such a codon-optimized gene may have 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:2. Thus, even though a codon-optimized gene may have only about 70% sequence identity or less to the original gene, the codon-optimized gene encodes the same amino acid sequence as the original gene.
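The point that a recoded gene can diverge at the nucleotide level while encoding an identical protein can be illustrated with a short sketch. The two 12-nucleotide sequences below are hypothetical toy examples, not portions of any SEQ ID NO, and identity is computed position by position on equal-length, ungapped sequences, which is a simplifying assumption; sequence identity is ordinarily determined from an alignment.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame coding sequence into one-letter amino acids."""
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def percent_identity(a, b):
    """Ungapped, position-by-position identity of two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


native = "ATGCTGAGCGGC"    # hypothetical native fragment: Met-Leu-Ser-Gly
recoded = "ATGTTATCTGGT"   # hypothetical recoded fragment, same amino acids

print(translate(native), translate(recoded))                   # MLSG MLSG
print(percent_identity(native, recoded))                       # 50.0
print(percent_identity(translate(native), translate(recoded))) # 100.0
```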

A recombinant methyltransferase gene may vary from a naturally-occurring methyltransferase gene because the recombinant methyltransferase gene may be codon-optimized for expression in a eukaryotic cell, such as a plant cell, algae cell, or yeast cell. A cell may comprise a recombinant methyltransferase gene, wherein the recombinant methyltransferase gene is codon-optimized for the cell.

Exactly, at least, or at most N codons of a recombinant methyltransferase gene, where N is any integer from 1 through 500, inclusive, may vary from a naturally-occurring methyltransferase gene or may be unchanged from a naturally-occurring methyltransferase gene. For example, a recombinant methyltransferase gene may comprise a nucleotide sequence with at least about 65% sequence identity with the naturally-occurring nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17 (e.g., at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity), and at least 5 codons of the nucleotide sequence of the recombinant methyltransferase gene may vary from the naturally-occurring nucleotide sequence (e.g., at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 codons).
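Counting the codons that vary between a recombinant gene and a naturally-occurring gene amounts to a codon-by-codon comparison of two coding sequences read in the same frame. A minimal sketch, assuming equal-length, in-frame sequences; the helper names and the toy sequences are hypothetical:

```python
def split_codons(cds):
    """Split an in-frame coding sequence into a list of three-letter codons."""
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def count_changed_codons(natural, recombinant):
    """Number of codon positions at which the two sequences differ."""
    a, b = split_codons(natural), split_codons(recombinant)
    if len(a) != len(b):
        raise ValueError("sequences must contain the same number of codons")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical four-codon fragments, for illustration only.
natural_fragment = "ATGAAACGTTGG"
recombinant_fragment = "ATGAAGAGATGG"

print(count_changed_codons(natural_fragment, recombinant_fragment))  # 2
```

In this toy pair the first and last codons (ATG and TGG) are unchanged and the two middle codons differ, so the count is 2.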

A methyltransferase gene encodes a methyltransferase protein. A methyltransferase protein may be a protein expressed by a species of Gammaproteobacteria, such as bacteria from the genera Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, or Halofilum. A recombinant methyltransferase gene may encode a naturally-occurring methyltransferase protein even if the recombinant methyltransferase gene is not a naturally-occurring methyltransferase gene. For example, a recombinant methyltransferase gene may vary from a naturally-occurring methyltransferase gene because the recombinant methyltransferase gene is codon-optimized for expression in a specific cell. The codon-optimized, recombinant methyltransferase gene and the naturally-occurring methyltransferase gene may nevertheless encode the same naturally-occurring methyltransferase protein.

A recombinant methyltransferase gene may encode a methyltransferase protein selected from the group consisting of Desulfobacula balticum protein tmpB (SEQ ID NO:2), Marinobacter hydrocarbonclasticus protein tmpB (SEQ ID NO:4), Thiohalospira halophila protein tmpB (SEQ ID NO:6), Desulfobacter curvatus protein tmpB (SEQ ID NO:8), Desulfobacter phenolica protein tmpB (SEQ ID NO:10), Desulfobacula toluolica protein tmpB (SEQ ID NO:12), Desulfobacter postgatei protein tmpB (SEQ ID NO:14), Halofilum ochraceum protein tmpB (SEQ ID NO:16), and Marinobacter aquaeolei protein tmpB (SEQ ID NO:18). It is specifically contemplated that one or more of the above methyltransferase proteins may be excluded from embodiments of this invention. A recombinant methyltransferase gene may encode a methyltransferase protein, and the methyltransferase protein may be substantially identical to any one of the foregoing enzymes, but the recombinant methyltransferase gene may vary from the naturally-occurring gene that encodes the enzyme. The recombinant methyltransferase gene may vary from the naturally-occurring gene because the recombinant methyltransferase gene may be codon-optimized for expression in a specific phylum, class, order, family, genus, species, or strain of cell.

The sequences of naturally-occurring methyltransferase proteins are set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, and SEQ ID NO:18. A recombinant methyltransferase gene may or may not encode a protein comprising 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18. For example, a recombinant methyltransferase gene may encode a protein having 100% sequence identity with a biologically-active portion of an amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18.

A recombinant methyltransferase gene may encode a methyltransferase protein having, having at least, or having at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18, or a biologically-active portion thereof. A recombinant methyltransferase gene may encode a methyltransferase protein having about, at least about, or at most about 0.1%, 0.5%, 1%, 2%, 3%, 4%, 5%, 10%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, 100%, 100.1%, 100.2%, 100.3%, 100.4%, 100.5%, 100.6%, 100.7%, 100.8%, 100.9%, 101%, 105%, 110%, 115%, 120%, 125%, 130%, 135%, 140%, 145%, 150%, 160%, 170%, 180%, 190%, 200%, 220%, 240%, 260%, 280%, 300%, 320%, 340%, 360%, 380%, or 400% methyltransferase activity relative to a protein comprising the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18.
A recombinant methyltransferase gene may encode a protein having at least 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 contiguous amino acids starting at any one of positions 1 through 500, inclusive, of the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18.

Substrates for the methyltransferase protein may include any fatty acid from 14 to 18 carbons long with an unsaturated double bond in the Δ9, Δ10, or Δ11 position. The substrate may have a chain that is 14, 15, 16, 17, or 18 carbons long, or any range derivable therein. The methyltransferase protein may be capable of catalyzing the formation of a methylene substitution at the Δ9, Δ10, or Δ11 position of such a substrate.

In some embodiments, the recombinant methyltransferase gene encodes a methyltransferase protein that has specific amino acids unchanged from the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18. The unchanged amino acids can include 1, 2, 3, 4, 5, 6, 7, 8, or 9 amino acids selected from Y163, T175, R199, E211, G269, Y271, N313, N319, and W389 of Marinobacter hydrocarbonclasticus tmpB or corresponding amino acids in tmpB from Desulfobacula balticum, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, or Marinobacter aquaeolei, according to the alignment set forth in FIGS. 7A-D.
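The residue check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the labels are those quoted in the preceding paragraph, the protein is a hypothetical placeholder built for the example (it is not any sequence from the Sequence Listing), and the candidate is assumed to already use Marinobacter hydrocarbonclasticus tmpB numbering. Mapping a homolog onto that numbering would require the alignment of FIGS. 7A-D, which the sketch does not perform.

```python
import re

# Residue labels quoted from the paragraph above (M. hydrocarbonclasticus tmpB numbering).
TMPB_LABELS = ["Y163", "T175", "R199", "E211", "G269", "Y271", "N313", "N319", "W389"]


def parse_label(label):
    """Split a label such as 'Y163' into ('Y', 163)."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None:
        raise ValueError(f"not a residue label: {label!r}")
    return match.group(1), int(match.group(2))


def unchanged_residues(protein, labels):
    """Return the labels whose residue is found at the stated 1-based position."""
    kept = []
    for label in labels:
        residue, position = parse_label(label)
        if position <= len(protein) and protein[position - 1] == residue:
            kept.append(label)
    return kept


def toy_protein(labels, length=400, filler="A"):
    """Hypothetical placeholder: filler everywhere except at the labeled positions."""
    chars = [filler] * length
    for label in labels:
        residue, position = parse_label(label)
        chars[position - 1] = residue
    return "".join(chars)


reference = toy_protein(TMPB_LABELS)
variant = reference[:162] + "F" + reference[163:]  # substitute position 163 (Y to F)

print(len(unchanged_residues(reference, TMPB_LABELS)))
print(unchanged_residues(variant, TMPB_LABELS))
```

Counting the returned labels gives the number of unchanged amino acids (1 through 9) referred to in the text.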

c. Nucleic Acids Comprising a Recombinant Reductase Gene

A reductase gene (e.g., a recombinant reductase gene) encodes a reductase protein, which is an enzyme capable of reducing a double bond of a fatty acid (e.g., wherein the fatty acid is present as a free fatty acid, carboxylate, phospholipid, diacylglycerol, or triacylglycerol). The reductase gene (e.g., a recombinant reductase gene) may have a coding region that is identical to one from a bacterium of the class Gammaproteobacteria. The reductase gene may comprise any one of the nucleotide sequences set forth in SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, and SEQ ID NO:35. The reductase gene (e.g., a recombinant reductase gene) may be a 10-methylstearic A gene (tmpA) as described herein, or a biologically-active portion thereof (i.e., wherein the biologically-active portion thereof comprises reductase activity).

The reductase gene (e.g., a recombinant reductase gene) may be derived from a species of Gammaproteobacteria, such as bacteria from the genera Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, or Halofilum. The reductase gene (e.g., a recombinant reductase gene) may be selected from the group consisting of Desulfobacula balticum gene tmpA (SEQ ID NO:19), Marinobacter hydrocarbonclasticus gene tmpA (SEQ ID NO:21), Thiohalospira halophila gene tmpA (SEQ ID NO:23), Desulfobacter curvatus gene tmpA (SEQ ID NO:25), Desulfobacter phenolica gene tmpA (SEQ ID NO:27), Desulfobacula toluolica gene tmpA (SEQ ID NO:29), Desulfobacter postgatei gene tmpA (SEQ ID NO:31), Halofilum ochraceum gene tmpA (SEQ ID NO:33), and Marinobacter aquaeolei gene tmpA (SEQ ID NO:35). It is specifically contemplated that one or more of the above reductase genes may be excluded from embodiments of this invention.

A recombinant reductase gene may be recombinant because it is operably linked to a promoter other than the naturally-occurring promoter of the reductase gene. Such genes may be useful to drive transcription in a particular species of cell. A recombinant reductase gene may be recombinant because it contains one or more nucleotide substitutions relative to a naturally-occurring reductase gene. Such genes may be useful to increase the translation efficiency of the reductase gene's mRNA transcript in a particular species of cell.

A nucleic acid may comprise a recombinant reductase gene and a promoter, wherein the recombinant reductase gene and promoter are operably linked. The recombinant reductase gene and promoter may be derived from different species. For example, the recombinant reductase gene may encode the reductase protein of a species of Gammaproteobacteria, and the recombinant reductase gene may be operably-linked to a promoter that can drive transcription in another type of bacteria or a eukaryote (e.g., an algae cell, yeast cell, or plant cell). The promoter may be a eukaryotic promoter. A cell may comprise the nucleic acid, and the promoter may be capable of driving transcription in the cell. A cell may comprise a recombinant reductase gene, and the recombinant reductase gene may be operably linked to a promoter capable of driving transcription of the recombinant reductase gene in the cell. The cell may be a species of yeast, and the promoter may be a yeast promoter. The cell may be a species of bacteria, and the promoter may be a bacterial promoter (e.g., wherein the bacterial promoter is not a promoter from a Gammaproteobacterium). The cell may be a species of algae, and the promoter may be an algae promoter. The cell may be a species of plant, and the promoter may be a plant promoter.

A recombinant reductase gene may be operably linked to a promoter that cannot drive transcription in the cell from which the recombinant reductase gene originated. For example, the promoter may not be capable of binding an RNA polymerase of the cell from which a recombinant reductase gene originated. In some embodiments, the promoter cannot bind a prokaryotic RNA polymerase and/or initiate transcription mediated by a prokaryotic RNA polymerase. In some embodiments, a recombinant reductase gene is operably linked to a promoter that cannot drive transcription in the cell from which the protein encoded by the gene originated. For example, the promoter may not be capable of binding an RNA polymerase of a cell that naturally expresses the reductase enzyme encoded by a recombinant reductase gene.

A recombinant reductase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the nucleotide sequence set forth in SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, or SEQ ID NO:35. A recombinant reductase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, or 1300 contiguous base pairs starting at nucleotide position 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 
230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 
630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792, 793, 794, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805, 806, 807, 808, 809, 810, 811, 812, 813, 814, 815, 816, 817, 818, 819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831, 832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844, 845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857, 858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, 897, 898, 899, 900, 901, 902, 903, 904, 905, 906, 907, 908, 909, 910, 911, 912, 913, 914, 915, 916, 917, 918, 919, 920, 921, 922, 923, 924, 925, 926, 927, 928, 929, 930, 931, 932, 933, 934, 935, 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956, 957, 958, 959, 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 1020, 1021, 1022, 1023, 1024, 
1025, 1026, 1027, 1028, 1029, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039, 1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148, 1149, 1150, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160, 1161, 1162, 1163, 1164, 1165, 1166, 1167, 1168, 1169, 1170, 1171, 1172, 1173, 1174, 1175, 1176, 1177, 1178, 1179, 1180, 1181, 1182, 1183, 1184, 1185, 1186, 1187, 1188, 1189, 1190, 1191, 1192, 1193, 1194, 1195, 1196, 1197, 1198, 1199, or 1200 of the nucleotide sequence set forth in SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, or SEQ ID NO:35. A recombinant reductase gene may or may not have 100% sequence identity with any one of the nucleotide sequences set forth in SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, and SEQ ID NO:35. A recombinant reductase gene may or may not have 100% sequence identity with 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, or 1300 contiguous base pairs of the nucleotide sequence set forth in SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, or SEQ ID NO:35. 
A recombinant reductase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the nucleotide sequence set forth in SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, or SEQ ID NO:35, and the recombinant reductase gene may encode a reductase protein with, with at least, or with at most 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36. For example, a gene that is codon-optimized for expression in yeast may have about 70% sequence identity with SEQ ID NO:19, while the protein encoded by such a codon-optimized gene may have 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:20. Thus, even though a codon-optimized gene may have only about 70% sequence identity or less to the original gene, the codon-optimized gene encodes the same amino acid sequence as the original gene.
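The relationship between nucleotide identity and protein identity in a codon-optimized gene can be illustrated with a short Python sketch. The two 15-nucleotide sequences are hypothetical toy examples, not sequences from the Sequence Listing, and percent identity is computed here by simple position-by-position comparison of equal-length sequences, which is an assumption made for brevity; sequences of different lengths would first need to be aligned.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, and so on.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(a, b):
    """Ungapped identity between two equal-length sequences, in percent."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


native = "ATGGCTCTGAAAGGT"   # hypothetical "naturally-occurring" toy gene
recoded = "ATGGCCTTAAAGGGA"  # same protein, synonymous codons substituted

print(round(percent_identity(native, recoded), 1))                 # nucleotide identity
print(percent_identity(translate(native), translate(recoded)))     # protein identity
```

In this toy case only 10 of the 15 nucleotides match, yet both sequences translate to the same five-residue peptide, which mirrors the point made in the text.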

A recombinant reductase gene may vary from a naturally-occurring reductase gene because the recombinant reductase gene may be codon-optimized for expression in a eukaryotic cell, such as a plant cell, algae cell, or yeast cell. A cell may comprise a recombinant reductase gene, wherein the recombinant reductase gene is codon-optimized for the cell.

Exactly, at least, or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 
416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, or 500 codons of a recombinant reductase gene may vary from a naturally-occurring reductase gene or may be unchanged from a naturally-occurring reductase gene. For example, a recombinant reductase gene may comprise a nucleotide sequence with at least about 65% sequence identity with the naturally-occurring nucleotide sequence set forth in SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, or SEQ ID NO:35 (e.g., at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity), and at least 5 codons of the nucleotide sequence of the recombinant reductase gene may vary from the naturally-occurring nucleotide sequence (e.g., at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 codons).
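As a rough illustration of the example in the preceding paragraph, the following Python sketch counts the codons that differ between two genes and checks the two example thresholds (at least about 65% nucleotide identity and at least 5 varied codons). The sequences are hypothetical toy inputs, the function names are illustrative, and the genes are assumed to be in frame and of equal length so that no alignment is needed.

```python
def split_codons(dna):
    """Split an in-frame coding sequence into its codons."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def count_varied_codons(native, recombinant):
    """Number of codon positions at which the two genes differ."""
    native_codons = split_codons(native)
    recombinant_codons = split_codons(recombinant)
    if len(native_codons) != len(recombinant_codons):
        raise ValueError("genes must contain the same number of codons")
    return sum(a != b for a, b in zip(native_codons, recombinant_codons))


def meets_example(native, recombinant, min_identity=65.0, min_varied=5):
    """Check the two example thresholds: nucleotide identity and varied codons."""
    matches = sum(x == y for x, y in zip(native, recombinant))
    identity = 100.0 * matches / len(native)
    varied = count_varied_codons(native, recombinant)
    return identity >= min_identity and varied >= min_varied


native = "ATGGCTCTGAAAGGTTCT"       # hypothetical toy gene, six codons
recombinant = "ATGGCCTTAAAGGGATCC"  # five of the six codons changed

print(count_varied_codons(native, recombinant))
print(meets_example(native, recombinant))
```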

A reductase gene encodes a reductase protein. A reductase protein may be a protein expressed by a species of Gammaproteobacteria, such as bacteria from the genera Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, or Halofilum. A recombinant reductase gene may encode a naturally-occurring reductase protein even if the recombinant reductase gene is not a naturally-occurring reductase gene. For example, a recombinant reductase gene may vary from a naturally-occurring reductase gene because the recombinant reductase gene is codon-optimized for expression in a specific cell. The codon-optimized, recombinant reductase gene and the naturally-occurring reductase gene may nevertheless encode the same naturally-occurring reductase protein.

A recombinant reductase gene may encode a reductase protein selected from the group consisting of Desulfobacula balticum protein tmpA (SEQ ID NO:20), Marinobacter hydrocarbonclasticus protein tmpA (SEQ ID NO:22), Thiohalospira halophila protein tmpA (SEQ ID NO:24), Desulfobacter curvatus protein tmpA (SEQ ID NO:26), Desulfobacter phenolica protein tmpA (SEQ ID NO:28), Desulfobacula toluolica protein tmpA (SEQ ID NO:30), Desulfobacter postgatei protein tmpA (SEQ ID NO:32), Halofilum ochraceum protein tmpA (SEQ ID NO:34), and Marinobacter aquaeolei protein tmpA (SEQ ID NO:36). It is specifically contemplated that one or more of the above reductase proteins may be excluded from embodiments of this invention. A recombinant reductase gene may encode a reductase protein, and the reductase protein may be substantially identical to any one of the foregoing enzymes, but the recombinant reductase gene may vary from the naturally-occurring gene that encodes the enzyme. The recombinant reductase gene may vary from the naturally-occurring gene because the recombinant reductase gene may be codon-optimized for expression in a specific phylum, class, order, family, genus, species, or strain of cell.

The sequences of naturally-occurring reductase proteins are set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, and SEQ ID NO:36. A recombinant reductase gene may or may not encode a protein having 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36. For example, a recombinant reductase gene may encode a protein having 100% sequence identity with a biologically-active portion of an amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36.

A recombinant reductase gene may encode a reductase protein having, having at least, or having at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36, or a biologically-active portion thereof. A recombinant reductase gene may encode a reductase protein having about, at least about, or at most about 0.1%, 0.5%, 1%, 2%, 3%, 4%, 5%, 10%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, 100%, 100.1%, 100.2%, 100.3%, 100.4%, 100.5%, 100.6%, 100.7%, 100.8%, 100.9%, 101%, 105%, 110%, 115%, 120%, 125%, 130%, 135%, 140%, 145%, 150%, 160%, 170%, 180%, 190%, 200%, 220%, 240%, 260%, 280%, 300%, 320%, 340%, 360%, 380%, or 400% reductase activity relative to a protein comprising the amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36. 
A recombinant reductase gene may encode a protein having, having at least, or having at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 contiguous amino acids starting at amino acid position 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 
303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, or 500 of the amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36.

Substrates for the reductase protein may include any fatty acid from 14 to 18 carbons long with a methylene substitution in the Δ9, Δ10, or Δ11 position. The substrate may be 14, 15, 16, 17, or 18 carbons long, or any range derivable therein. The reductase protein may be capable of catalyzing the reduction of a methylene-substituted fatty acid substrate to a (methyl)lipid. The reductase protein, together with a methyltransferase protein, may be capable of catalyzing the production of a methylated branch from any fatty acid from 14 to 18 carbons long with an unsaturated double bond in the Δ9, Δ10, or Δ11 position, including fatty acids that are 14, 15, 16, 17, or 18 carbons long, or any range derivable therein.
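The two-step conversion described above (methyltransferase followed by reductase) can be modeled schematically in Python. This is a bookkeeping sketch only: the class and function names are illustrative assumptions, the position is carried through unchanged exactly as stated in the text, and no attempt is made to model the chemistry, such as which carbon of the double bond receives the substituent.

```python
from dataclasses import dataclass, replace

ALLOWED_LENGTHS = range(14, 19)   # chains of 14 to 18 carbons
ALLOWED_POSITIONS = {9, 10, 11}   # the Δ9, Δ10, or Δ11 position


@dataclass(frozen=True)
class FattyAcid:
    carbons: int    # chain length, not counting any branch carbon
    position: int   # Δ position of the feature
    feature: str    # "double_bond", "methylene", or "methyl"


def _in_scope(fatty_acid):
    return (fatty_acid.carbons in ALLOWED_LENGTHS
            and fatty_acid.position in ALLOWED_POSITIONS)


def methyltransferase(fatty_acid):
    """Unsaturated substrate -> methylene-substituted intermediate."""
    if fatty_acid.feature != "double_bond" or not _in_scope(fatty_acid):
        raise ValueError("not a methyltransferase substrate in this model")
    return replace(fatty_acid, feature="methylene")


def reductase(fatty_acid):
    """Methylene-substituted intermediate -> methyl-branched product."""
    if fatty_acid.feature != "methylene" or not _in_scope(fatty_acid):
        raise ValueError("not a reductase substrate in this model")
    return replace(fatty_acid, feature="methyl")


substrate = FattyAcid(carbons=18, position=9, feature="double_bond")
product = reductase(methyltransferase(substrate))
print(product)
```

Applying `reductase` directly to the unsaturated substrate raises an error in this model, reflecting that the reductase acts on the methylene-substituted intermediate rather than on the original double bond.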

In some embodiments, the recombinant reductase gene encodes a reductase protein that has specific amino acids unchanged from the amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36. The unchanged amino acids can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 amino acids selected from I8, L22, F37, P38, R39, K41, G45, W46, P49, G144, C148, P149, E169, E171, L197, I212, C249, H250, Y252, I270, G275, L276, E283, A296, and A299 of Marinobacter hydrocarbonclasticus tmpA or corresponding amino acids in tmpA from Desulfobacula balticum, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, or Marinobacter aquaeolei, according to the alignment set forth in FIGS. 8A-D.

As used herein, the term “complementary” and derivatives thereof are used in reference to pairing of nucleic acids by the well-known rules that A pairs with T or U and C pairs with G. Complement can be “partial” or “complete”. In partial complement, only some of the nucleic acid bases are matched according to the base pairing rules, while in complete or total complement, all the bases are matched according to the pairing rules. The degree of complement between the nucleic acid strands may have significant effects on the efficiency and strength of hybridization between nucleic acid strands, as is well known in the art. The efficiency and strength of said hybridization depends upon the detection method.

Any nucleic acid that is referred to herein as having a certain percent sequence identity to a sequence set forth in a SEQ ID NO, includes nucleic acids that have the certain percent sequence identity to the complement of the sequence set forth in the SEQ ID NO.
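The pairing rules and the distinction between partial and complete complement can be sketched as follows. The function names and the four-base test strands are illustrative assumptions; the sketch compares two equal-length strands read antiparallel and does not model hybridization strength.

```python
# Pairing rules as stated above: A pairs with T or U, and C pairs with G.
PAIRING = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(dna):
    """Return the complementary DNA strand, written 5' to 3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def fraction_paired(strand, other):
    """Fraction of positions that pair when `other` is read antiparallel to `strand`."""
    if len(strand) != len(other) or not strand:
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum((x, y) in PAIRING for x, y in zip(strand.upper(), reversed(other.upper())))
    return paired / len(strand)


print(reverse_complement("ATGC"))
print(fraction_paired("ATGC", "GCAT"))  # every base paired: complete complement
print(fraction_paired("ATGC", "GCAA"))  # one mismatch: partial complement
```

Under this sketch, a percent identity computed against a sequence in a SEQ ID NO could equally be computed against `reverse_complement` of that sequence, consistent with the preceding paragraph.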

d. Nucleic Acids Comprising a Recombinant Methyltransferase Gene and a Recombinant Reductase Gene

A nucleic acid may comprise both a recombinant methyltransferase gene and a recombinant reductase gene. The recombinant methyltransferase gene and the recombinant reductase gene may encode proteins from the same species or from different species.

A nucleic acid may comprise the nucleotide sequence of an expression vector comprising a tmp operon that includes both a methyltransferase gene and a reductase gene. Such vectors may include pNC1071 (SEQ ID NO:39), which includes a Desulfobacter postgatei tmp operon; pNC1072 (SEQ ID NO:40), which includes a Desulfobacula balticum tmp operon; pNC1073 (SEQ ID NO:41), which includes a Desulfobacula toluolica tmp operon; pNC1074 (SEQ ID NO:42), which includes a Marinobacter hydrocarbonclasticus tmp operon; and pNC1076 (SEQ ID NO:43), which includes a Thiohalospira halophila tmp operon.

In some embodiments, the nucleic acid encodes a fusion protein that includes both a methyltransferase and a reductase or fragments thereof. In the context of the present invention, “fusion protein” means a single protein molecule containing two or more distinct proteins or fragments thereof, covalently linked via peptide bond in a single peptide chain. In some embodiments, the fusion protein comprises enzymatically active domains from both a methyltransferase protein and a reductase protein. The nucleic acid may further encode a linker peptide between the methyltransferase and the reductase. In some embodiments, the linker peptide comprises the amino acid sequence AGGAEGGNGGGA (SEQ ID NO:44). The linker may comprise about or at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, or 30 amino acids, or any range derivable therein. The nucleic acid may comprise any of the methyltransferase and reductase genes described herein, and the fusion protein encoded by the nucleic acid can comprise any of the methyltransferase and reductase proteins described herein, including biologically active fragments thereof. In some embodiments, the fusion protein is a tmpA-B protein, in which the tmpA protein is closer to the N-terminus than the tmpB protein.
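The arrangement of a tmpA-B fusion protein can be sketched at the sequence level as below. The linker is the sequence quoted above (SEQ ID NO:44); the two domain strings are hypothetical placeholders, not tmpA or tmpB sequences, and the function name is an illustrative assumption.

```python
LINKER = "AGGAEGGNGGGA"  # linker peptide quoted in the text (SEQ ID NO:44)

# Hypothetical placeholders standing in for the two enzyme sequences.
TMPA_PLACEHOLDER = "MAAAA"
TMPB_PLACEHOLDER = "MKKKK"


def build_fusion(n_terminal, c_terminal, linker=LINKER):
    """Join two protein sequences through a linker in a single peptide chain."""
    return n_terminal + linker + c_terminal


# tmpA-B orientation: the tmpA portion is closer to the N-terminus.
fusion = build_fusion(TMPA_PLACEHOLDER, TMPB_PLACEHOLDER)
linker_start = fusion.find(LINKER)

print(fusion)
print(len(LINKER), linker_start)
```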

3. Compositions

Various aspects of the invention relate to compositions produced by the cells described herein. The composition may be an oil composition comprised of about, at least about, or at most about 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% lipids by weight. The composition may comprise branched (methyl)lipids and/or exomethylene-substituted lipids. The branched (methyl)lipid may be a carboxylic acid (e.g., 10-methylstearic acid, 10-methylpalmitic acid, 12-methyloleic acid, 13-methyloleic acid, 10-methyl-octadec-12-enoic acid), carboxylate (e.g., 10-methylstearate, 10-methylpalmitate, 12-methyloleate, 13-methyloleate, 10-methyl-octadec-12-enoate), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylstearyl CoA, 10-methylpalmityl CoA, 12-methyloleoyl CoA, 13-methyloleoyl CoA, 10-methyl-octadec-12-enoyl CoA), or amide. The exomethylene-substituted lipid may be a carboxylic acid (e.g., 10-methylenestearic acid, 10-methylenepalmitic acid, 12-methyleneoleic acid, 13-methyleneoleic acid, 10-methylene-octadec-12-enoic acid), carboxylate (e.g., 10-methylenestearate, 10-methylenepalmitate, 12-methyleneoleate, 13-methyleneoleate, 10-methylene-octadec-12-enoate), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylenestearyl CoA, 10-methylenepalmityl CoA, 12-methyleneoleoyl CoA, 13-methyleneoleoyl CoA, 10-methylene-octadec-12-enoyl CoA), or amide. The composition may comprise 10-methyl lipids, 10-methylene lipids, or both. It is specifically contemplated that one or more of the above lipids may be excluded from certain embodiments.

In some aspects, the composition is produced by cultivating a culture comprising any of the cells described herein and recovering the oil composition from the cell culture. The cells in the culture may contain any of the recombinant methyltransferase genes described herein and/or any of the recombinant reductase genes described herein. The culture medium and conditions can be chosen based on the species of the cell to be cultured and can be optimized to provide for maximal production of the desired lipid profile.

Various methods are known for recovering an oil composition from a culture of cells. For example, lipids, lipid derivatives, and hydrocarbons can be extracted with a hydrophobic solvent such as hexane. Lipids and lipid derivatives can also be extracted using liquefaction, oil liquefaction, and supercritical CO₂ extraction. The recovery process may include harvesting cultured cells, such as by filtration or centrifugation, lysing cells to create a lysate, and extracting the lipid/hydrocarbon components using a hydrophobic solvent.

In addition to accumulating within cells, the lipids described herein may be secreted by the cells. In that case, a process for recovering the lipid may not require creating a lysate from the cells and may instead involve collecting the secreted lipid from the culture medium. Thus, the compositions described herein may be made by culturing a cell that secretes one of the lipids described herein, such as a linear fatty acid with a chain length of 14-18 carbons with a methyl branch at the Δ9, Δ10, or Δ11 position.

In some embodiments, the oil composition comprises about, at least about, or at most about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% by weight of a branched (methyl)lipid, such as a 10-methyl fatty acid, or any range derivable therein. In some embodiments, 10-methyl fatty acids comprise about, at least about, or at most about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% by weight of the fatty acids in the composition, or any range derivable therein.

The amount of 10-methyl fatty acids in a cell can be optimized by various methods. For example, increasing the expression of tmpA and/or tmpB can increase the methyltransferase and/or reductase activity within the cell, which may lead to accumulation of greater amounts of branched (methyl)lipids. One way this can be accomplished is by increasing the number of copies of the gene in the cell, such as by including the genes on high-copy-number plasmids. Additionally or alternatively, the tmpA and/or tmpB genes can be operably linked to a promoter that drives high levels of expression.

4. Methods of Producing Branched (Methyl)Lipid

Various aspects of the invention relate to a method of producing a branched (methyl)lipid. The method may comprise incubating a cell or plurality of cells as described herein, supra, with media. The media may optionally be supplemented with an unbranched, unsaturated fatty acid, such as oleic acid, that serves as a substrate for methylation. The substrate may include one or more fatty acids from 14 to 18 carbons long with a double bond in the Δ9, Δ10, or Δ11 position. The substrate may be 14, 15, 16, 17, or 18 carbons long, or any range derivable therein. The media may optionally be supplemented with methionine or S-adenosyl methionine, which may similarly serve as a substrate. Thus, the method may comprise contacting a cell or plurality of cells with oleic acid (or some other substrate to be methylated), methionine, or both. The method may comprise incubating a cell or plurality of cells as described herein, supra, in a bioreactor. The method may comprise recovering lipids from the cells, such as by extraction with an organic solvent.

The method may comprise degumming the cell or plurality of cells, e.g., to remove proteins. The method may comprise transesterification or esterification of the lipids of the cells. An alcohol such as methanol or ethanol may be used for transesterification or esterification, e.g., thereby producing a fatty acid methyl ester or fatty acid ethyl ester.
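
As a worked illustration of the esterification step, the sketch below computes approximate molar masses for 10-methylstearic acid and its methyl and ethyl esters. The molecular formula C19H38O2 follows from stearic acid (C18H36O2) plus one methyl branch; the rounded atomic masses and the helper names are assumptions made for this example only.

```python
# Rounded standard atomic masses in g/mol (illustrative precision only).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# 10-methylstearic acid: stearic acid (C18H36O2) with one added methyl branch.
ACID = {"C": 19, "H": 38, "O": 2}


def molar_mass(formula: dict) -> float:
    """Sum atomic masses over an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def alkyl_ester(acid: dict, alcohol_carbons: int) -> dict:
    """Esterifying R-COOH with a linear alcohol of n carbons releases water,
    so the ester has a net gain of CnH2n over the free acid."""
    return {
        "C": acid["C"] + alcohol_carbons,
        "H": acid["H"] + 2 * alcohol_carbons,
        "O": acid["O"],
    }


acid_mass = molar_mass(ACID)
methyl_ester_mass = molar_mass(alkyl_ester(ACID, 1))  # methanol
ethyl_ester_mass = molar_mass(alkyl_ester(ACID, 2))   # ethanol

print(f"free acid:    {acid_mass:.2f} g/mol")
print(f"methyl ester: {methyl_ester_mass:.2f} g/mol")
print(f"ethyl ester:  {ethyl_ester_mass:.2f} g/mol")
```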

EXAMPLES

The present invention will be described in greater detail by way of specific examples. The following examples are offered for illustrative purposes only, and are not intended to limit the invention in any manner. Those of skill in the art will readily recognize a variety of noncritical parameters which can be changed or modified to yield essentially the same results.

Example 1

Identification of tmpB and tmpA Genes in Gammaproteobacteria

Select Gammaproteobacteria are known to produce branched 10-methyl fatty acids. The acetate-oxidizing, sulfate-reducing Desulfobacter bacteria were reported to produce 10-methylhexadecanoic acid at 6%-24% of total phospholipid-ester linked fatty acid content (Dowling, Microbiology 132:1815-25 (1986)). Other reports of 10-methyl branched fatty acid production exist for bacteria in the genus Marinobacter (Márquez, J. Syst. Evol. Microbiol. 55:1349-51 (2005); Huu, Int. J. Syst. Evol. Microbiol. 49:367-75 (1999); Gauthier, Int. J. Syst. Evol. Microbiol. 42:568-76 (1992)), Thiohalospira (Sorokin, Int. J. Syst. Evol. Microbiol., 58:2890-97 (2008)), Thiohalorhabdus (Sorokin, Int. J. Syst. Evol. Microbiol. 58:2890-97 (2008)), Desulfobacula, and Desulfotignum (Kuever, Int. J. Syst. Evol. Microbiol. 51:171-77 (2001)). However, no genes or enzymes involved in Gammaproteobacteria 10-methyl fatty acid production have been described. This Example describes a pair of genes, distinct in phylogeny and sequence homology, that are present in certain Gammaproteobacteria and direct production of 10-methyl fatty acids in heterologous hosts.

A list of Gammaproteobacteria that produce 10-methyl fatty acids and have sequenced genomes was compiled from literature reports. Additionally, representative Gammaproteobacteria that are not reported to produce 10-methyl fatty acids were included for comparison. According to a biochemical study on the unrelated bacterium Mycobacterium phlei using unpurified enzyme preparations, the first step of 10-methyl fatty acid synthesis occurs via a mechanism similar to cyclopropane fatty acid synthesis and is followed by an enzymatic reduction step (Akamatsu, J. Biol. Chem. 245:701-08 (1970)). To find gene candidates responsible for 10-methyl fatty acid production, the Gammaproteobacteria genomes were scanned for homologs of E. coli cyclopropane fatty acyl phospholipid synthase (cfa), which is responsible for methylation of unsaturated fatty acids to produce cyclopropane fatty acids (Wang, Biochemistry 31:11020-28 (1992); Taylor, Biochemistry 18:3292-3300 (1979)). This was performed using the NCBI BLAST protein analysis tool and the BioCyc genomic database (Caspi, Nucleic Acids Res. 40:D742-53 (2016)). Next, the cfa homologs were scanned for adjacent genes in an operon structure that had homology to an oxidoreductase or electron transfer function. Interestingly, Gammaproteobacteria able to produce 10-methyl fatty acids all possessed a gene operon (referred to herein as the tmp operon) with a cyclopropane fatty acid synthase gene homolog (referred to herein as tmpB) and a gene with homology to a geranylgeranyl reductase (referred to herein as tmpA). These results are summarized in FIG. 2. It is unlikely that tmpA is a true geranylgeranyl reductase, since that enzyme is involved in chlorophyll and tocopherol biosynthesis, and the bacteria produce neither chemical.
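
The second screening step above (looking next to each cfa homolog for a reductase-like neighbor in an operon-like arrangement) can be sketched as a simple adjacency filter. This is not the procedure actually used, which relied on NCBI BLAST and BioCyc; the gene records, keyword matching, and the `max_gap` threshold below are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    locus: str
    start: int
    end: int
    strand: str       # "+" or "-"
    annotation: str


def is_cfa_like(gene: Gene) -> bool:
    return "cyclopropane" in gene.annotation.lower()


def is_reductase_like(gene: Gene) -> bool:
    text = gene.annotation.lower()
    return "reductase" in text or "electron transfer" in text


def find_candidate_pairs(genes, max_gap=200):
    """Return (cfa_like_locus, reductase_like_locus) for neighboring genes that
    share a strand and lie within max_gap bp of each other (assumed threshold)."""
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        if left.strand != right.strand or right.start - left.end > max_gap:
            continue
        for a, b in ((left, right), (right, left)):
            if is_cfa_like(a) and is_reductase_like(b):
                pairs.append((a.locus, b.locus))
    return pairs


# Hypothetical toy genome; loci, coordinates, and annotations are invented.
toy_genes = [
    Gene("g1", 100, 1300, "+", "geranylgeranyl reductase family protein"),
    Gene("g2", 1320, 2600, "+", "cyclopropane-fatty-acyl-phospholipid synthase"),
    Gene("g3", 5000, 6000, "+", "hypothetical protein"),
    Gene("g4", 9000, 10200, "-", "cyclopropane-fatty-acyl-phospholipid synthase"),
    Gene("g5", 10250, 11000, "+", "oxidoreductase"),
]

pairs = find_candidate_pairs(toy_genes)
print(pairs)
```

In the toy data only g2 and g1 qualify: g4 and g5 are close together but sit on opposite strands, so they are not treated as an operon.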

Example 2

E. coli Expression of the tmpB and tmpA Gene Products

To test if the tmp gene operon was responsible for Gammaproteobacteria 10-methyl fatty acid production, the genes were designed in an E. coli expression vector using the DNA manipulation software A Plasmid Editor and synthesized by Thermofisher Scientific-GeneArt. The native codon usage of the tmp genes was not changed. tmpB gene transcription was controlled using the constitutively active tac promoter (de Boer 1983), followed by the E. coli lacZ-lacY intergene linker region, the tmpA gene, and the trpT′ gene terminator (Wu 1981). These synthetic gene operons were cloned into an E. coli expression vector containing the AmpR ampicillin resistance gene and the ColE1 origin of replication (FIG. 3A-3B). The plasmid vectors are named pNC1071 (SEQ ID NO:39), which includes the Desulfobacter postgatei tmp operon; pNC1072 (SEQ ID NO:40), which includes the Desulfobacula balticum tmp operon; pNC1073 (SEQ ID NO:41), which includes the Desulfobacula toluolica tmp operon; pNC1074 (SEQ ID NO:42), which includes the Marinobacter hydrocarbonclasticus tmp operon; and pNC1076 (SEQ ID NO:43), which includes the Thiohalospira halophila tmp operon.
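
For reference, the construct layout and plasmid assignments in the paragraph above can be captured in a small lookup structure. Every value below is taken from the text; the variable names and the dictionary layout are only one hypothetical way to organize them.

```python
# Element order of the synthetic operon, 5' to 3', as described in the text.
OPERON_LAYOUT = (
    "tac promoter",
    "tmpB",
    "lacZ-lacY intergenic linker",
    "tmpA",
    "trpT' terminator",
)

# Plasmid name -> (source organism of the tmp operon, SEQ ID NO).
PLASMIDS = {
    "pNC1071": ("Desulfobacter postgatei", 39),
    "pNC1072": ("Desulfobacula balticum", 40),
    "pNC1073": ("Desulfobacula toluolica", 41),
    "pNC1074": ("Marinobacter hydrocarbonclasticus", 42),
    "pNC1076": ("Thiohalospira halophila", 43),
}

for name, (organism, seq_id) in PLASMIDS.items():
    print(f"{name}: {organism} tmp operon (SEQ ID NO:{seq_id})")
print(" -> ".join(OPERON_LAYOUT))
```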

Plasmids pNC1071, pNC1072, pNC1073, pNC1074, pNC1076, and the control plasmid pNC53 containing the AmpR gene, ColE1 origin, and tac promoter were transformed into E. coli Top10 (Invitrogen) using a standard electrotransformation protocol utilizing 50 μL suspended cells, 1 μL of plasmid DNA at a concentration of 200 ng per μL, a 1 mm gap electrotransformation cuvette, and a pulse with 1.8 kV voltage, 200 Ω, and 25 μF with exponential decay and a time constant of approximately 4.5 milliseconds. During the protocol, cells were kept on ice and the cuvette was pre-chilled before pulsing with a Bio-Rad Gene Pulser Electroporation System. After pulsing, cells were transferred to 1 mL SOC medium and incubated at 37° C. for 1 hour before plating on LB agar containing 100 μg per mL ampicillin antibiotic.
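
The pulse settings quoted above imply a nominal field strength and decay constant that can be checked with simple arithmetic. The sketch below assumes an ideal exponential-decay (RC) discharge, which the text does not state explicitly; the effective resistance is merely back-calculated from the reported time constant, and one plausible reading is that the sample's own conductance lowers the total resistance slightly below the 200 Ω setting.

```python
# Settings quoted in the text.
VOLTAGE_KV = 1.8
GAP_CM = 0.1              # 1 mm cuvette gap
RESISTANCE_OHM = 200.0
CAPACITANCE_F = 25e-6     # 25 microfarads
OBSERVED_TAU_MS = 4.5     # "approximately 4.5 milliseconds"

# Nominal field strength across the cuvette gap.
field_kv_per_cm = VOLTAGE_KV / GAP_CM

# Ideal RC time constant, converted from seconds to milliseconds.
nominal_tau_ms = RESISTANCE_OHM * CAPACITANCE_F * 1e3

# Resistance that would give the observed time constant with the same capacitor.
effective_resistance_ohm = (OBSERVED_TAU_MS / 1e3) / CAPACITANCE_F

print(f"field strength:       {field_kv_per_cm:.1f} kV/cm")
print(f"nominal tau (R*C):    {nominal_tau_ms:.1f} ms")
print(f"effective resistance: {effective_resistance_ohm:.0f} ohm")
```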

Single colonies from the transformation plates were chosen and grown in 5 mL LB liquid media in 14 mL plastic falcon tubes overnight at 37° C. These were used to prepare freezer vials with 0.75 mL culture broth and 0.75 mL of 50% glycerol/water which were stored at −80° C.

Fermentation studies were performed in 50 mL LB media with 100 μg per mL ampicillin in 250 mL baffled shake flasks. 10 μL of frozen culture stock was added to the media and the flask was incubated at 37° C. and shaken at 200 rpm in a New Brunswick orbital incubator for 24 hours. Cells were harvested by centrifugation at 4000 rpm for 15 minutes in an Eppendorf 5810 R clinical centrifuge, resuspended in 0.5 mL deionized water, and frozen at −80° C.
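
The volumes in the freezer-stock and shake-flask steps above reduce to a few quantities that are easy to verify. The sketch below assumes simple additive mixing and ignores the 10 μL inoculum volume when computing the dilution; whether the 50% glycerol stock is v/v or w/v is not stated in the text, so the result is reported in the same unspecified percent units.

```python
def mixed_percent(stock_percent: float, stock_ml: float, diluent_ml: float) -> float:
    """Final percent concentration after mixing a stock with a diluent."""
    return stock_percent * stock_ml / (stock_ml + diluent_ml)


# Freezer vials: 0.75 mL of 50% glycerol mixed with 0.75 mL culture broth.
glycerol_final_percent = mixed_percent(50.0, 0.75, 0.75)

# Shake flask: 10 uL frozen stock into 50 mL medium (50 mL = 50,000 uL).
inoculum_dilution_fold = (50.0 * 1000.0) / 10.0

# Ampicillin needed for 50 mL at 100 ug/mL, expressed in mg.
ampicillin_mg = 100.0 * 50.0 / 1000.0

print(f"final glycerol:    {glycerol_final_percent:.0f}%")
print(f"inoculum dilution: 1:{inoculum_dilution_fold:.0f}")
print(f"ampicillin:        {ampicillin_mg:.0f} mg per flask")
```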

FIG. 6 shows that E. coli transformed with pNC1071, pNC1073, pNC1074, and pNC1076, but not the empty vector control (pNC53), produced 10-methylene hexadecenoic acid.

To test the acyl chain substrate range for the tmpB and tmpA enzymes, E. coli transformed with pNC1074 (M. hydrocarbonclasticus tmp operon) or pNC1076 (T. halophila tmp operon) were grown in LB media supplemented with ampicillin and 100 mg/L of one of the fatty acids indicated in Table 1 below. After culturing, cells were harvested by centrifugation, washed with deionized water, resuspended in deionized water, and frozen. Cells were then lyophilized to dryness and used to perform an HCl-methanol catalyzed transesterification reaction to produce fatty acid methyl esters (FAME). These samples were dissolved in isooctane and injected into a gas chromatography system (Agilent Technologies) equipped with a flame ionization detector. Table 1 shows the percentage of each fatty acid that was converted to methylene- and methyl-branched fatty acids.
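
The text does not state how percent conversion was calculated from the chromatograms. One common approach, shown here purely as an assumption, is to divide the summed peak areas of the methylene- and methyl-branched products by the summed areas of substrate plus products, treating the detector response as equal for all three species. The function name and the peak areas are hypothetical.

```python
def percent_conversion(substrate_area: float,
                       methylene_area: float,
                       methyl_area: float) -> float:
    """Percent of a fed fatty acid recovered as branched products, from FAME
    peak areas. Assumes equal detector response for all three species."""
    total = substrate_area + methylene_area + methyl_area
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * (methylene_area + methyl_area) / total


# Hypothetical peak areas in arbitrary units; not data from the patent.
example = percent_conversion(substrate_area=580.0,
                             methylene_area=300.0,
                             methyl_area=120.0)
print(f"{example:.0f}% converted")
```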

TABLE 1

Fatty acid conversion to methylene and methyl branched fatty acids with E. coli expressing the tmpB and tmpA genes from M. hydrocarbonclasticus and T. halophila.

Fatty acid	E. coli + pNC1074 (M. hydrocarbonclasticus tmpBA) percent conversion	E. coli + pNC1076 (T. halophila tmpBA) percent conversion
12:1Δ10	0%	0%
13:1Δ12	0%	0%
14:1Δ9	89%	95%
15:1Δ10	86%	69%
16:1Δ9	55%	95%
17:1Δ10	36%	19%
18:1Δ6	0%	0%
18:1Δ9	42%	47%
18:1Δ11	9%	8%
19:1Δ7	0%	0%
19:1Δ10	0%	0%
20:1Δ5	0%	0%
20:1Δ8	0%	0%
20:1Δ11	0%	0%
22:1Δ13	0%	0%
24:1Δ15	0%	0%

As shown in Table 1, methylation occurred on fatty acids with 14, 15, 16, 17, and 18 carbons, and on Δ9, Δ10, and Δ11 double bond positions.
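
The summary above can be reproduced directly from the Table 1 values. The sketch below transcribes the table, parses the fatty acid shorthand (carbons:double bonds, then the Δ position), and collects the chain lengths and double bond positions of every substrate with nonzero conversion in either strain. The parsing helper is an illustrative assumption, not part of the described method.

```python
import re

# Table 1: fatty acid -> (pNC1074 percent conversion, pNC1076 percent conversion).
TABLE_1 = {
    "12:1Δ10": (0, 0), "13:1Δ12": (0, 0), "14:1Δ9": (89, 95),
    "15:1Δ10": (86, 69), "16:1Δ9": (55, 95), "17:1Δ10": (36, 19),
    "18:1Δ6": (0, 0), "18:1Δ9": (42, 47), "18:1Δ11": (9, 8),
    "19:1Δ7": (0, 0), "19:1Δ10": (0, 0), "20:1Δ5": (0, 0),
    "20:1Δ8": (0, 0), "20:1Δ11": (0, 0), "22:1Δ13": (0, 0),
    "24:1Δ15": (0, 0),
}


def parse_fatty_acid(name: str):
    """Split shorthand such as '18:1Δ9' into (carbons, double_bonds, position)."""
    match = re.fullmatch(r"(\d+):(\d+)Δ(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized fatty acid shorthand: {name}")
    return tuple(int(part) for part in match.groups())


converted = [fa for fa, (a, b) in TABLE_1.items() if a > 0 or b > 0]
chain_lengths = sorted({parse_fatty_acid(fa)[0] for fa in converted})
positions = sorted({parse_fatty_acid(fa)[2] for fa in converted})

print("converted substrates:", converted)
print("chain lengths:", chain_lengths)
print("double bond positions:", positions)
```

Neither criterion is sufficient alone: 18:1Δ6 has a permitted chain length but was not converted, and 12:1Δ10 and 19:1Δ10 have a permitted double bond position but fall outside the 14 to 18 carbon range.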

Example 3

tmpB Gene Expression in Yeast

To test the production of 10-methylene fatty acids by the tmpB genes in the yeast Saccharomyces cerevisiae and Yarrowia lipolytica, the genes containing native bacterial codons were cloned into a standard Yarrowia overexpression vector. The vector contains a selectable NAT marker and a 2μ origin of replication for high copy maintenance in Saccharomyces cerevisiae. The resulting plasmids are pNC996 (Desulfobacter postgatei tmpB), pNC998 (Desulfobacula balticum tmpB), pNC1000 (Desulfobacula toluolica tmpB), pNC1002 (Marinobacter hydrocarbonclasticus tmpB), and pNC1006 (Thiohalospira halophila tmpB). For Saccharomyces, plasmids were transformed into NS20 by standard heat shock protocol. Single cells of the resulting transformations were selected and further grown in 96-well shaking plates in YPD supplemented with 50 μg/mL nourseothricin for 2 days at 30° C. For Yarrowia, plasmids were transformed into strain NS1009. Resulting transformed strains were grown in 96-well shaking plates in standard nitrogen limited media for 4 days at 30° C. For all yeast experiments, cell pellets were isolated by centrifugation and freeze dried for fatty acid analysis by gas chromatography as performed for E. coli samples. Total fatty acids were measured and the total amount of C16 and C18 fatty acids containing the methylene intermediates was quantified.

Results: Three tmpB genes, those from Desulfobacula balticum, Marinobacter hydrocarbonclasticus, and Thiohalospira halophila, produced 10-methylene fatty acids in NS20 (FIG. 4). The tmpB genes from Marinobacter hydrocarbonclasticus and Thiohalospira halophila were able to produce 10-methylene fatty acids in Yarrowia lipolytica (FIG. 5).

Example 4

tmpB and tmpA Sequence Analysis

TmpB protein sequences encoded by the tmpB genes from Desulfobacula balticum, Marinobacter hydrocarbonclasticus, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, and Marinobacter aquaeolei were aligned with the cyclopropane fatty acid synthase (Cfa) enzyme from Escherichia coli with the CLUSTAL OMEGA software program (European Molecular Biology Laboratory, EMBL). FIGS. 7A-D show the alignment of these protein sequences and indicate a number of amino acids that are conserved in the TmpB protein sequences but not in the E. coli Cfa sequence. The following amino acids are conserved in the TmpB aligned proteins, but not present in the E. coli Cfa protein: Y163, T175, R199, E211, G269, Y271, N313, N319, W389 (amino acid number based on the M. hydrocarbonclasticus TmpB protein). The percent sequence identity of each of the aligned proteins as compared to M. hydrocarbonclasticus TmpB is indicated below:


Protein	% Identity of amino acid sequence
Desulfobacula balticum TmpB	37%
Thiohalospira halophila TmpB	58%
Desulfobacter curvatus TmpB	43%
Desulfobacter phenolica TmpB	39%
Desulfobacula toluolica TmpB	39%
Desulfobacter postgatei TmpB	43%
Halofilum ochraceum TmpB	59%
Marinobacter aquaeolei TmpB	88%
Escherichia coli Cfa	46%
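
For readers unfamiliar with how a percent identity figure is derived from an alignment, the sketch below counts identical residues over the aligned columns in which neither sequence has a gap. This is a generic illustration under that stated denominator; alignment tools differ in how they treat gaps, so it is not claimed to reproduce the CLUSTAL OMEGA values above. The two aligned strings are invented toy sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned columns where both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment; not TmpB or Cfa sequence data.
toy_a = "MKT-AYLV"
toy_b = "MKSQAYIV"

identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity over gap-free columns")
```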

TmpA protein sequences encoded by the tmpA genes from Desulfobacula balticum, Marinobacter hydrocarbonclasticus, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, and Marinobacter aquaeolei were aligned with the Archaeoglobus fulgidus geranylgeranyl reductase protein AF0464 with the CLUSTAL OMEGA software program (European Molecular Biology Laboratory, EMBL). FIGS. 8A-D show the alignment of these protein sequences and indicate a number of amino acids that are conserved in the TmpA protein sequences but not in the Archaeoglobus fulgidus geranylgeranyl reductase protein AF0464. The following amino acids are conserved in the TmpA aligned proteins, but not present in the Archaeoglobus fulgidus geranylgeranyl reductase protein AF0464: I8, L22, F37, P38, R39, K41, G45, W46, P49, G144, C148, P149, E169, E171, L197, I212, C249, H250, Y252, I270, G275, L276, E283, A296, A299 (amino acid number based on the M. hydrocarbonclasticus TmpA protein).


Protein	% Identity of amino acid sequence
Desulfobacula balticum TmpA	33%
Thiohalospira halophila TmpA	57%
Desulfobacter curvatus TmpA	36%
Desulfobacter phenolica TmpA	34%
Desulfobacula toluolica TmpA	34%
Desulfobacter postgatei TmpA	34%
Halofilum ochraceum TmpA	64%
Marinobacter aquaeolei TmpA	83%
Archaeoglobus fulgidus AF0464	27%

INVENTORS:

Blitzblau, Hannah, Crabtree, Donald V., Shaw, Arthur J.

THIS PATENT REFERENCES THESE PATENTS:

Patent	Priority	Assignee	Title
10457963	Sep 20 2016	GINKGO BIOWORKS, INC	Heterologous production of 10-methylstearic acid
20020120958
WO2018057607

ASSIGNMENT RECORDS

Executed on	Assignor	Assignee	Conveyance	Frame	Reel	Doc
Jun 29 2012	CRABTREE, DONALD	NOVOGY, INC	ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS	058000	0379	pdf
Jul 19 2012	SHAW, JOE	NOVOGY, INC	ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS	058002	0400	pdf
Jul 16 2015	BLITZBLAU, HANNAH	NOVOGY, INC	ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS	058000	0395	pdf
Sep 20 2018		Ginkgo Bioworks, Inc.	(assignment on the face of the patent)
Oct 15 2020	NOVOGY, INC	GINKGO BIOWORKS, INC	ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS	054142	0171	pdf
Oct 15 2020	NOVOGY, INC	GINKGO BIOWORKS, INC	CORRECTIVE ASSIGNMENT TO CORRECT THE RECEIVING PARTY NAME PREVIOUSLY RECORDED AT REEL: 054142 FRAME: 0171 ASSIGNOR S HEREBY CONFIRMS THE ASSIGNMENT	054451	0918	pdf

MAINTENANCE FEES AND DATES:

Date	Maintenance Fee Events
Mar 17 2020	BIG: Entity status set to Undiscounted (note the period is included in the code).

Date	Maintenance Schedule
Feb 01 2025	4 years fee payment window open
Aug 01 2025	6 months grace period start (w surcharge)
Feb 01 2026	patent expiry (for year 4)
Feb 01 2028	2 years to revive unintentionally abandoned end. (for year 4)
Feb 01 2029	8 years fee payment window open
Aug 01 2029	6 months grace period start (w surcharge)
Feb 01 2030	patent expiry (for year 8)
Feb 01 2032	2 years to revive unintentionally abandoned end. (for year 8)
Feb 01 2033	12 years fee payment window open
Aug 01 2033	6 months grace period start (w surcharge)
Feb 01 2034	patent expiry (for year 12)
Feb 01 2036	2 years to revive unintentionally abandoned end. (for year 12)