Title:  Use of Bioinformatic Tools to Study Myostatin Genes and Proteins

 

Author:  Jason B. King, Biology

              Grambling State University

 

Faculty Research Advisor:  Sandra Rodriguez-Zas, Animal Sciences, University of Illinois at Urbana-Champaign

Team Leader: Dawn Williams

Summer Research Opportunities Program University of Illinois at Urbana-Champaign

 

 

       Growth factors (GFs) are proteins that can stimulate or inhibit cell proliferation. They can be grouped by amino acid sequence and tertiary structure. One large group of GFs is the transforming growth factor beta (TGFb) superfamily, whose members exert multiple effects on cell function and are extensively expressed. Among the TGFb, the growth and differentiation factors (GDFs) regulate cell growth and differentiation. In particular, GDF-8, also called myostatin, is a skeletal muscle protein associated with double muscling in mice and cattle. Myostatin is encoded by the GDF-8 gene and regulates muscle growth.

       McPherron and other researchers investigated myostatin gene mutations in mice. The gene that encodes the myostatin protein was disrupted, leaving the animals with a nonfunctional myostatin gene. These mutant “gene knockout” mice were 30% larger than their counterparts that had a functional myostatin gene. Further studies revealed that some breeds of cattle carry a mutation in the same gene: a deletion in the bovine myostatin gene causes the double-muscled phenotype in cattle. Scientists have also reported myostatin sequences for 9 other vertebrate animals, including pigs, chickens, and humans.

 

 


        Research is also being conducted on the effects of myostatin in humans. One study, conducted by Ferrell and colleagues, investigated variation in the human myostatin gene sequence and examined whether myostatin variants influence the response of muscle mass to strength training. The study found no significant difference between genotypes in the response to strength training. However, future research is expected to assess potential benefits to human health.

       The Biology Workbench is a computational interface and environment that permits anyone with a Web browser to readily use bioinformatics for research, teaching, or learning. It consists of a set of scripts that link the user's browser to a collection of information sources (databases) and application programs. The scripts are specialized for the interface of each program and information source. Functionally, they transform the interface for each object, whether database or application program, into a common Web-based form that permits them to be seamlessly interconnected. The Biology Workbench is a significant bioinformatics resource that provides a suite of interactive tools which draw on a host of biology databases and allow users to compare molecular sequences using high performance computing facilities, visualize and manipulate molecular structures, and generate phylogenetic hypotheses.

       Biology is sometimes called an "information-driven" science. This means that the raw material of biology is information, the results of experiments in the laboratory and observations in the field. In this view, the science of biology is all about constructing meaning from the information.    In the last couple of decades, there has been a technical revolution in molecular biology, which has made it possible to get enormous amounts of information about the sequences of amino acids in proteins and the sequences of bases in nucleic acids. To construct meaning from the sequence information, the array of computer techniques called "bioinformatics" has been developed. The components of bioinformatics are sequence databases, and computer programs that analyze the sequences for patterns and similarities. Now, bioinformatics is showing us many things about what sequences mean. Using bioinformatics, sequences are being used to reveal relationships among different life forms that we could not find out any other way. Bioinformatics is revealing the rules and meaning of a language that is new to human beings but in fact is a billion years old-the Language of Life.

        The objective of this research project is to apply bioinformatic techniques to the DNA, RNA, and amino acid sequences available in public databases to better understand biological processes. The procedure involves the use of a suite of computational biology tools to analyze the action of genes and associated proteins. The expected outcomes are a set of educational modules that can be used in K-12 and undergraduate teaching of Genetics, Biochemistry, and Mathematics.

 

 

 

 


 

 

 

 

 

A search for myostatin and GDF8 was conducted in SWISS-PROT, a protein sequence database; PIR, the Protein Information Resource; PRINTS, a protein motif fingerprint database; and BLOCKS, a database of multiple alignments of conserved regions of protein families. The search yielded 10 sequences that were imported into the Biology Workbench for further analysis and interpretation.

 

The sequences retrieved from the protein databases were:

 

     SWISSPROT:GDF8_HUMAN   O14793. GROWTH/DIFFERENTIATION FACTOR 8 PRECURSOR (GDF-8) (MYOSTATIN). Homo sapiens (Human).

     SWISSPROT:GDF8_BOVIN   O18836. GROWTH/DIFFERENTIATION FACTOR 8 PRECURSOR (GDF-8) (MYOSTATIN). Bos taurus (Bovine).

     SWISSPROT:GDF8_PIG     O18831. GROWTH/DIFFERENTIATION FACTOR 8 PRECURSOR (GDF-8) (MYOSTATIN). Sus scrofa (Pig).

     SWISSPROT:GDF8_SHEEP   O18830. GROWTH/DIFFERENTIATION FACTOR 8 PRECURSOR (GDF-8) (MYOSTATIN). Ovis aries (Sheep).

     SWISSPROT:GDF8_RAT     O35312. GROWTH/DIFFERENTIATION FACTOR 8 PRECURSOR (GDF-8) (MYOSTATIN). Rattus norvegicus (Rat).

     SWISSPROT:GDF8_PAPHA   O18828. GROWTH/DIFFERENTIATION FACTOR 8 PRECURSOR (GDF-8) (MYOSTATIN). Papio hamadryas (Hamadryas baboon).

     SWISSPROT:GDF8_MELGA   O42221. GROWTH/DIFFERENTIATION FACTOR 8 PRECURSOR (GDF-8) (MYOSTATIN). Meleagris gallopavo (Common turkey).

     SWISSPROT:GDF8_CHICK   O42220. GROWTH/DIFFERENTIATION FACTOR 8 PRECURSOR (GDF-8) (MYOSTATIN). Gallus gallus (Chicken).

     SWISSPROT:GDF8_BRARE   O42222. GROWTH/DIFFERENTIATION FACTOR 8 PRECURSOR (GDF-8) (MYOSTATIN). Brachydanio rerio (Zebrafish) (Zebra danio).

     SWISSPROT:GDF8_MOUSE   O08689. GROWTH/DIFFERENTIATION FACTOR 8 PRECURSOR (GDF-8) (MYOSTATIN). Mus musculus (Mouse).

 

 

CLUSTALW, a multiple sequence alignment tool, was used to compare the protein sequences. This tool aligns two or more sequences by maximizing the similarity between paired amino acid positions.

 

CLUSTAL W (1.81) multiple sequence alignment

 

GDF8_MELGA      -MQKLAVYVYIYLFMQILVHPV---ALDGSSQPTENAEKDGLCNACTWRQNTKSSRIEAI

GDF8_CHICK      -MQKLAVYVYIYLFMQIAVDPV---ALDGSSQPTENAEKDGLCNACTWRQNTKSSRIEAI

GDF8_HUMAN      -MQKLQLCVYIYLFMLIVAGPV---DLNENSEQKENVEKEGLCNACTWRQNTKSSRIEAI

GDF8_PAPHA      -MQKLQLCVYIYLFMLIVAGPV---DLNENSEQKENVEKEGLCNACTWRQNTKSSRIEAI

GDF8_BOVIN      -MQKLQISVYIYLFTLIVAGPV---DLNENSEQKENVEKEGLCNACLWRENTTSSRLEAI

GDF8_SHEEP      -MQKLQIFVYIYLFMLLVAGPV---DLNENSEQKENVEKKGLCNACLWRQNNKSSRLEAI

GDF8_PIG        -MQKLQIYVYIYLFMLIVAGPV---DLNENSEQKENVEKEGLCNACMWRQNTKSSRLEAI

GDF8_RAT        MIQKPQMYVYIYLFVLIAAGPV---DLNEDSEREANVEKEGLCNACAWRQNTRYSRIEAI

GDF8_MOUSE      MMQKLQMYVYIYLFMLIAAGPV---DLNEGSEREENVEKEGLCNACAWRQNTRYSRIEAI

GDF8_BRARE      ---MHFTQVLISLSVLIACGPVGYGDITAHQQPSTATEESELCSTCEFRQHSKLMRLHAI

                        * * *   :   **    :   .:    .*:. **.:* :*::.   *:.**

 

GDF8_MELGA      KIQILSKLRLEQAPNISRDVIKQLLPKAPPLQELIDQYDVQRDDSSDGSLEDDDYHATTE

GDF8_CHICK      KIQILSKLRLEQAPNISRDVIKQLLPKAPPLQELIDQYDVQRDDSSDGSLEDDDYHATTE

GDF8_HUMAN      KIQILSKLRLETAPNISKDVIRQLLPKAPPLRELIDQYDVQRDDSSDGSLEDDDYHATTE

GDF8_PAPHA      KIQILSKLRLETAPNISKDAIRQLLPKAPPLRELIDQYDVQRDDSSDGSLEDDDYHATTE

GDF8_BOVIN      KIQILSKLRLETAPNISKDAIRQLLPKAPPLLELIDQFDVQRDASSDGSLEDDDYHARTE

GDF8_SHEEP      KIQILSKLRLETAPNISKDAIRQLLPKAPPLRELIDQYDVQRDDSSDGSLEDDDYHVTTE

GDF8_PIG        KIQILSKLRLETAPNISKDAIRQLLPKAPPLRELIDQYDVQRDDSSDGSLEDDDYHATTE

GDF8_RAT        KIQILSKLRLETAPNISKDAIRQLLPRAPPLRELIDQYDVQRDDSSDGSLEDDDYHATTE

GDF8_MOUSE      KIQILSKLRLETAPNISKDAIRQLLPRAPPLRELIDQYDVQRDDSSDGSLEDDDYHATTE

GDF8_BRARE      KSQILSKLRLKQAPNISRDVVKQLLPKAPPLQQLLDQYDVLGDDSKDGAVEEDDEHATTE

                * ********: *****:*.::****:**** :*:**:**  * *.**::*:** *. **

 

GDF8_MELGA      TIITMPTESDFLVQMEGKPKCCFFKFSSKIQYNKVVKAQLWIYLRQVQKPTTVFVQILRL

GDF8_CHICK      TIITMPTESDFLVQMEGKPKCCFFKFSSKIQYNKVVKAQLWIYLRQVQKPTTVFVQILRL

GDF8_HUMAN      TIITMPTESDFLMQVDGKPKCCFFKFSSKIQYNKVVKAQLWIYLRPVETPTTVFVQILRL

GDF8_PAPHA      TIITMPTESDFLMQVDGKPKCCFFKFSSKIQYNKVVKAQLWIYLRPVETPTTVFVQILRL

GDF8_BOVIN      TVITMPTESDLLTQVEGKPKCCFFKFSSKIQYNKLVKAQLWIYLRPVKTPATVFVQILRL

GDF8_SHEEP      TVITMPTESDLLAEVQEKPKCCFFKFSSKIQHNKVVKAQLWIYLRPVKTPTTVFVQILRL

GDF8_PIG        TIITMPTESDLLMQVEGKPKCCFFKFSSKIQYNKVVKAQLWIYLRPVKTPTTVFVQILRL

GDF8_RAT        TIITMPTESDFLMQADGKPKCCFFKFSSKIQYNKVVKAQLWIYLRAVKTPTTVFVQILRL

GDF8_MOUSE      TIITMPTESDFLMQADGKPKCCFFKFSSKIQYNKVVKAQLWIYLRPVKTPTTVFVQILRL

GDF8_BRARE      TIMTMATEPDPIVQVDRKPKCCFFSFSPKIQANRIVRAQLWVHLRPAEEATTVFLQISRL

                *::**.**.* : : : *******.**.*** *::*:****::** .: .:***:** **

 

GDF8_MELGA      IKPMKDGTRYTGIRSLKLDMNPGTGIWQSIDVKTVLQNWLKQPESNLGIEIKAFDENGRD

GDF8_CHICK      IKPMKDGTRYTGIRSLKLDMNPGTGIWQSIDVKTVLQNWLKQPESNLGIEIKAFDETGRD

GDF8_HUMAN      IKPMKDGTRYTGIRSLKLDMNPGTGIWQSIDVKTVLQNWLKQPESNLGIEIKALDENGHD

GDF8_PAPHA      IKPMKDGTRYTGIRSLKLDMNPGTGIWQSIDVKTVLQNWLKQPESNLGIEIKALDENGHD

GDF8_BOVIN      IKPMKDGTRYTGIRSLKLDMNPGTGIWQSIDVKTVLQNWLKQPESNLGIEIKALDENGHD

GDF8_SHEEP      IKPMKDGTRYTGIRSLKLDMNPGTGIWQSIDVKTVLQNWLKQPESNLGIEIKALDENGHD

GDF8_PIG        IKPMKDGTRYTGIRSLKLDMNPGTGIWQSIDVKTVLQNWLKQPESNLGIEIKALDENGHD

GDF8_RAT        IKPMKDGTRYTGIRSLKLDMSPGTGIWQSIDVKTVLQNWLKQPESNLGIEIKALDENGHD

GDF8_MOUSE      IKPMKDGTRYTGIRSLKLDMSPGTGIWQSIDVKTVLQNWLKQPESNLGIEIKALDENGHD

GDF8_BRARE      M-PVKDGGRHR-IRSLKIDVNAGVTSWQSIDVKQVLTVWLKQPETNRGIEINAYDAKGND

                : *:*** *:  *****:*:..*.  ******* **  ******:* ****:* * .*.*

 

GDF8_MELGA      LAVTFPGPGEDGLNPFLEVRVTDTPKRSRRDFGLDCDEHSTESRCCRYPLTVDFEAFGWD

GDF8_CHICK      LAVTFPGPGEDGLNPFLEVRVTDTPKRSRRDFGLDCDEHSTESRCCRYPLTVDFEAFGWD

GDF8_HUMAN      LAVTFPGPGEDGLNPFLEVKVTDTPKRSRRDFGLDCDEHSTESRCCRYPLTVDFEAFGWD

GDF8_PAPHA      LAVTFPGPGEDGLNPFLEVKVTDTPKRSRRDFGLDCDEHSTESRCCRYPLTVDFEALGWD

GDF8_BOVIN      LAVTFPEPGEDGLTPFLEVKVTDTPKRSRRDFGLDCDEHSTESRCCRYPLTVDFEAFGWD

GDF8_SHEEP      LAVTFPEPGEEGLNPFLEVKVTDTPKRSRRDFGLDCDEHSTESRCCRYPLTVDFEAFGWD

GDF8_PIG        LAVTFPGPGEDGLNPFLEVKVTDTPKRSRRDFGLDCDEHSTESRCCRYPLTVDFEAFGWD

GDF8_RAT        LAVTFPGPGEDGLNPFLEVKVTDTPKRSRRDFGLDCDEHSTESRCCRYPLTVDFEAFGWD

GDF8_MOUSE      LAVTFPGPGEDGLNPFLEVKVTDTPKRSRRDFGLDCDEHSTESRCCRYPLTVDFEAFGWD

GDF8_BRARE      LAVTSTETGEDGLLPFMEVKISEGPKRIRRDSGLDCDENSSESRCCRYPLTVDFEDFGWD

                **** . .**:** **:**:::: *** *** ******:*:************** :***

 

GDF8_MELGA      WIIAPKRYKANYCSGECEFVFLQKYPHTHLVHQANPRGSAGPCCTPTKMSPINMLYFNGK

GDF8_CHICK      WIIAPKRYKANYCSGECEFVFLQKYPHTHLVHQANPRGSAGPCCTPTKMSPINMLYFNGK

GDF8_HUMAN      WIIAPKRYKANYCSGECEFVFLQKYPHTHLVHQANPRGSAGPCCTPTKMSPINMLYFNGK

GDF8_PAPHA      WIIAPKRYKANYCSGECEFVFLQKYPHTHLVHQANPRGSAGPCCTPTKMSPINMLYFNGK

GDF8_BOVIN      WIIAPKRYKANYCSGECEFVFLQKYPHTHLVHQANPRGSAGPCCTPTKMSPINMLYFNGE

GDF8_SHEEP      WIIAPKRYKANYCSGECEFLFLQKYPHTHLVHQANPKGSAGPCCTPTKMSPINMLYFNGK

GDF8_PIG        WIIAPKRYKANYCSGECEFVFLQKYPHTHLVHQANPRGSAGPCCTPTKMSPINMLYFNGK

GDF8_RAT        WIIAPKRYKANYCSGECEFVFLQKYPHTHLVHQANPRGSAGPCCTPTKMSPINMLYFNGK

GDF8_MOUSE      WIIAPKRYKANYCSGECEFVFLQKYPHTHLVHQANPRGSAGPCCTPTKMSPINMLYFNGK

GDF8_BRARE      WIIAPKRYKANYCSGECDYMYLQKYPHTHLVNKASPRGTAGPCCTPTKMSPINMLYFNGK

                *****************::::**********::*.*:*:********************:

 

GDF8_MELGA      EQIIYGKIPAMVVDRCGCS

GDF8_CHICK      EQIIYGKIPAMVVDRCGCS

GDF8_HUMAN      EQIIYGKIPAMVVDRCGCS

GDF8_PAPHA      EQIIYGKIPAMVVDRCGCS

GDF8_BOVIN      GQIIYGKIPAMVVDRCGCS

GDF8_SHEEP      EQIIYGKIPGMVVDRCGCS

GDF8_PIG        EQIIYGKIPAMVVDRCGCS

GDF8_RAT        EQIIYGKIPAMVVDRCGCS

GDF8_MOUSE      EQIIYGKIPAMVVDRCGCS

GDF8_BRARE      EQIIYGKIPSMVVDRCGCS

                 ********.*********

The “*” marks positions that are fully conserved in all sequences (the same amino acid). The “:” marks positions that are strongly conserved (amino acids with similar properties). The “.” marks positions that are weakly conserved. A blank marks positions with high variability across the sequences compared. The CLUSTALW alignment showed that the human, chicken, bovine, sheep, pig, rat, mouse, and turkey (MELGA) sequences aligned consistently. However, the zebrafish (BRARE) protein sequence consistently diverged from the other sequences.
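The way a conservation line is derived from an alignment block can be illustrated with a minimal Python sketch. It is not the CLUSTALW source code: the function name `conservation_line` is hypothetical, and the "strong" and "weak" residue groups are listed here as recalled from the CLUSTAL W documentation, so they should be treated as an assumption and checked before reuse.

```python
# Residue groups as recalled from the CLUSTAL W documentation (an assumption here).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def conservation_line(rows):
    """Return one conservation symbol per column of an aligned block."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    symbols = []
    for column in zip(*rows):
        residues = set(column)
        if "-" in residues:
            symbols.append(" ")   # columns containing a gap are left blank
        elif len(residues) == 1:
            symbols.append("*")   # fully conserved
        elif any(residues <= set(group) for group in STRONG_GROUPS):
            symbols.append(":")   # all residues fall in one strong group
        elif any(residues <= set(group) for group in WEAK_GROUPS):
            symbols.append(".")   # all residues fall in one weak group
        else:
            symbols.append(" ")   # variable position
    return "".join(symbols)


# The four distinct rows of the last alignment block shown above.
last_block = [
    "EQIIYGKIPAMVVDRCGCS",  # GDF8_HUMAN
    "GQIIYGKIPAMVVDRCGCS",  # GDF8_BOVIN
    "EQIIYGKIPGMVVDRCGCS",  # GDF8_SHEEP
    "EQIIYGKIPSMVVDRCGCS",  # GDF8_BRARE
]
print(repr(conservation_line(last_block)))
```

For this block the sketch gives a blank for the first column (E and G share no group), a "." for the A/G/S column, and "*" elsewhere, which agrees with the consensus line printed under the last block of the alignment.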

 

BOXSHADE colors aligned sequences based on the degree of similarity between amino acids at every position. Different colors are assigned to positions that are fully conserved in all sequences (the same amino acid), nearly conserved (amino acids with similar properties), and not conserved.

 


 


BOXSHADE was used to color the sequences. The colors had to be changed so that the sequences could be viewed more clearly; they were changed under Shades/Color Schemes in the BOXSHADE menu. White was assigned to positions that were fully conserved in all proteins examined. Yellow was assigned to positions that were strongly conserved in all proteins examined. Blue was assigned to positions where the amino acids differed but had similar structure and charge. Gray was assigned to positions that had high variability across the comparisons. The sequences matched the consensus sequence at most positions, with the exception of the zebrafish (BRARE) protein sequence.

 

 


Drawtree and Drawgram are two tools used to draw evolutionary trees that show the degree of divergence among the species compared. Both tools accomplish the same goal but take different approaches: DRAWGRAM draws rooted phylogenetic trees from an alignment, whereas DRAWTREE draws unrooted phylogenetic trees from an alignment. The trees show that the protein sequences of the rat and mouse; human and baboon (PAPHA); bovine, sheep, and pig; and chicken and turkey (MELGA) group together. However, the zebrafish (BRARE) protein sequence is very different from the other protein sequences.

 

 

 

 

 

 

 


DRAWGRAM

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

DRAWTREE

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


GDF8 and GDF11 were aligned using CLUSTALW to compare the mouse and human protein sequences. GDF11 is a TGFb superfamily member closely related to myostatin (GDF8). The alignment between the protein sequences was therefore expected to be strongly conserved.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


The alignment showed that there was little to no difference between the GDF8 protein and the GDF11 protein of mouse and human.

 

 

 

*There was 99.7% sequence identity between the bovine MST and GDF8 protein sequences when checked using LFASTA, which calculates local sequence alignments. An identity of 99.7% indicates that the two proteins are nearly identical in sequence.
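A percent identity figure of this kind can be computed from two sequences that have already been aligned. The Python sketch below is only an illustration, not LFASTA's algorithm: it does not perform the local alignment itself, the function name `percent_identity` is hypothetical, and the choice to skip columns where both sequences have a gap is an assumption (tools differ in how they count the denominator).

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of compared columns in which two aligned sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue              # column carries no information for this pair
        compared += 1
        if a == b:
            identical += 1        # same residue in both sequences
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# Last block of the CLUSTALW alignment: human versus bovine GDF8.
human = "EQIIYGKIPAMVVDRCGCS"
bovine = "GQIIYGKIPAMVVDRCGCS"
print(round(percent_identity(human, bovine), 1))
```

In this short block the two sequences differ at a single position out of 19, so the value is well below the whole-protein figure; the full-length sequences would be needed to approach the number reported by LFASTA.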

 

The GenBank, FlyBase, GenBank patent, primate, mammal, rodent, invertebrate, and miscellaneous vertebrate databases were searched for myostatin and GDF8 to find nucleic acid sequences of myostatin. The results were imported, and SIXFRAME, which generates and imports six-frame translations of nucleic acid sequences, was used to see which translation aligned best with the human GDF8. The second frame of every sequence aligned with the human GDF8, except for the patent sequence. As a result, an alternative method was used for the patent sequence. Using LALIGN, which calculates optimal local sequence alignments, the patent and human nucleic acid sequences were aligned, and the local alignment that matched 100% was identified.

 

100.0% identity in 133 nt overlap; score: 665 E(10,000): 5.9e-46

        3310 3320 3330 3340 3350 3360
123112  AGATTCACTGGTGTGGCAAGTTGTCTCTCAGACTGTACATGCATTAAAATTTTGCTTGGCATTACTCAAAAGCAAAAGAAAAGTAAAAGGAAGAAACAAGAACAAGAAAAAAGATTATATTGATTTTAAAATC

 

The matching segment was added as a new sequence, and SIXFRAME was used to translate it for comparison with the human GDF8. The translation of the new nucleic acid sequence was then aligned with the human sequence using CLUSTALW.
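The six-frame translation step can be sketched in a few lines of Python. This is a minimal illustration using the standard genetic code, not the SIXFRAME program itself; the function names are hypothetical, and numbering the forward frames 1 to 3 and the reverse-complement frames 4 to 6 is an assumption about the tool's convention.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons from the start of dna; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return {frame number: protein} for 3 forward and 3 reverse frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]   # reverse complement
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])      # frames 1-3
        frames[offset + 4] = translate(reverse[offset:])  # frames 4-6 (assumed numbering)
    return frames


# The 133 nt segment from the LALIGN result above.
segment = (
    "AGATTCACTGGTGTGGCAAGTTGTCTCTCAGACTGTACATGCATTAAAATTTTGCTTGGCATTACTC"
    "AAAAGCAAAAGAAAAGTAAAAGGAAGAAACAAGAACAAGAAAAAAGATTATATTGATTTTAAAATC"
)
print(six_frames(segment)[2])
```

Frame 2 of the segment translates to DSLVWQVVSQTVHALKFCLALLKSKRKVKGRNKNKKKDYIDFKI, the same residues that appear in the translated patent segment in the alignment that follows.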

 

 

 

patent_seg_Translated_-_Fram      ------------------------------------------------DS

4028595_Translated_-_Frame_2      ------------------------------------------------DS

GDF8_HUMAN                        VVKAQLWIYLRPVETPTTVFVQILRLIKPMKDGTRYTGIRSLKLDMNPGT

                                                                                  .:

 

patent_seg_Translated_-_Fram      LVWQVVS----------QTVHALKFCLALLKS------------------

4028595_Translated_-_Frame_2      LVWQVVS----------QTVHALKFCLALLKS------------------

GDF8_HUMAN                        GIWQSIDVKTVLQNWLKQPESNLGIEIKALDENGHDLAVTFPGPGEDGLN

                                   :** :.          *.   * : :  *..                 

 

patent_seg_Translated_-_Fram      ---KRKVKGRNKNKKKDY-------------------IDFKI--------

4028595_Translated_-_Frame_2      ---KRKVKGRNKNKKKDY-------------------IDFKIMQKLQLCV

GDF8_HUMAN                        PFLEVKVTDTPKRSRRDFGLDCDEHSTESRCCRYPLTVDFEAFGWDWIIA

                                     : **..  *..::*:                   :**:

 

This proved to be a poor alignment. Therefore, CLUSTALW was used to align the new nucleic acid sequence with the GDF8 nucleic acid sequence. The alignment was imported, and CLUSTALWPROF, which aligns sequences to existing alignments (profiles), was used to align the human GDF8 with the previous alignment. The gap opening penalty, which ranges from 0 to 100, was changed to 80.

 

patent_seg_Translated_-_Fra      ---------DSLVWQVVSQTVHALKFCLALLKSKRK-------VKGRNKN

4028595_Translated_-_Frame_      ---------DSLVWQVVSQTVHALKFCLALLKSKRK-------VKGRNKN

GDF8_HUMAN                       MQKLQLCVYIYLFMLIVAGPVDLNENSEQKENVEKEGLCNACTWRQNTKS

                                            *.  :*: .*.  : .    : :::        : ..*.

 

patent_seg_Translated_-_Fra      KKKDYIDFKI----------------------------------------

4028595_Translated_-_Frame_      KKKDYIDFKIMQKLQLCVYIYLFMLIVAGPVDLNENSEQKENVEKEGLCN

GDF8_HUMAN                       SRIEAIKIQILS--------------------------------------

                                 .: : *.::*

 

 

 

PROSEARCH, which searches profile databases for patterns in a protein sequence, was used to find sequences with patterns similar to those of the human GDF8. The pattern searched for was [LIVM]-x(2)-P-x(2)-[FY]-x(4)-C-x-G-x-C. This pattern is representative of a C-terminal domain found in TGF-b. Several TGFs with patterns similar to GDF8 were found. GDF1 and BMP4 were the protein sequences chosen to be viewed.
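A PROSITE-style pattern such as the one above can be read as a regular expression: bracketed sets are alternatives, x is any residue, and a number in parentheses is a repeat count. The Python sketch below shows one way to do the conversion and search. It is an illustration only, not how PROSEARCH is implemented; the function name `prosite_to_regex` is hypothetical, and only the subset of the pattern syntax needed here is handled.

```python
import re

# One pattern element: x, a residue, [allowed set], or {excluded set},
# optionally followed by a repeat count such as (2) or (2,4).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Convert a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                         # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"     # any residue except these
        if low:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


tgf_beta_regex = prosite_to_regex("[LIVM]-x(2)-P-x(2)-[FY]-x(4)-C-x-G-x-C")

# One block of the human GDF8 sequence from the CLUSTALW alignment above.
human_block = "WIIAPKRYKANYCSGECEFVFLQKYPHTHLVHQANPRGSAGPCCTPTKMSPINMLYFNGK"
hit = re.search(tgf_beta_regex, human_block)
print(tgf_beta_regex, hit.group() if hit else None)
```

On this block the pattern matches the stretch IIAPKRYKANYCSGEC, which lies in the C-terminal part of the protein.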

 

 

PFSCAN, which searches a sequence against a set of profiles, showed that the TGF_BETA_2 profile (the C-terminal domain in TGF-beta and other growth factors) matches the region 277 - 375.

 

PDB Finder, a protein database finder that uses sequences derived from the PDB, DSSP, and HSSP databases, was used to search for a 3D image of myostatin. The search retrieved no matches, so an alternate route had to be taken. The TGF family, which was shown earlier to be similar to myostatin, was used in a PDB Finder search. Because the proteins have similar sequences, their structures are expected to be similar, so these structures give an approximate picture of myostatin. GDF1 and BMP4 were examined to better understand the structure of the myostatin protein.