Kozak sequence gene transcriptions: Difference between revisions

(Created page with "{{AE}} Henry A. Hoff The Kozak sequence is a nucleic acid motif that functions as the protein translation initiation site in most eukaryotic mRNA transcripts.<ref nam...")
 
 
(4 intermediate revisions by the same user not shown)
Line 39: Line 39:
|year=2004
|year=2004
|title=Beta+45 G --> C: a novel silent beta-thalassaemia mutation, the first in the Kozak sequence
|title=Beta+45 G --> C: a novel silent beta-thalassaemia mutation, the first in the Kozak sequence
|journal=Br J Haematol
|journal=British Journal of Haematology
|volume=124
|volume=124
|issue=2
|issue=2
Line 57: Line 57:


The sequence was discovered through a detailed analysis of DNA genomic sequences.<ref name=Kozak1984>{{ cite journal
The sequence was discovered through a detailed analysis of DNA genomic sequences.<ref name=Kozak1984>{{ cite journal
|last=Kozak|first=M
|last=Kozak|first=Marilyn
|date=1984-01-25
|date=1984-01-25
|title=Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs
|title=Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs
Line 69: Line 69:


The Kozak Sequence was determined by sequencing of 699 vertebrate mRNAs and verified by [[site-directed mutagenesis]].<ref name=Kozak1987>{{ cite journal
The Kozak Sequence was determined by sequencing of 699 vertebrate mRNAs and verified by [[site-directed mutagenesis]].<ref name=Kozak1987>{{ cite journal
|author=Kozak M
|author=Kozak Marilyn
|date=October 1987
|date=October 1987
|title=An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs
|title=An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs
Line 84: Line 84:
==Consensus sequences==
==Consensus sequences==


Kozak consensus sequence is 5'-GAAAATGG-3'.<ref name=Matsumoto>{{ cite journal
Kozak consensus sequence is GAAAATGG.<ref name=Matsumoto>{{ cite journal
|author=Takuya Matsumoto, Saemi Kitajima, Chisato Yamamoto, Mitsuru Aoyagi, Yoshiharu Mitoma, Hiroyuki Harada and Yuji Nagashima
|author=Takuya Matsumoto, Saemi Kitajima, Chisato Yamamoto, Mitsuru Aoyagi, Yoshiharu Mitoma, Hiroyuki Harada and Yuji Nagashima
|title=Cloning and tissue distribution of the ATP-binding cassette subfamily G member 2 gene in the marine pufferfish ''Takifugu rubripes''
|title=Cloning and tissue distribution of the ATP-binding cassette subfamily G member 2 gene in the marine pufferfish ''Takifugu rubripes''
Line 101: Line 101:
Consensus sequence for the Kozak is 5'-(GCC)GCC(A/G)CCATGG-3'.<ref name=Kozak1987/>
Consensus sequence for the Kozak is 5'-(GCC)GCC(A/G)CCATGG-3'.<ref name=Kozak1987/>


==Samplings==
==(Kozak) samplings==


Copying an apparent consensus sequence for the Kozak sequence of (GCC)GCC(A/G)CCATGG OR GCCACCAT and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.
Copying an apparent consensus sequence for the Kozak sequence of (GCC)GCC(A/G)CCATGG or GCCACCAT and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.
 
For the Basic programs testing consensus sequence GCCGCC(A/G)CCATGG (starting with SuccessablesKoz.bas) written to compare nucleotide sequences with the sequences on either the template strand (-), or coding strand (+), of the DNA, in the negative direction (-), or the positive direction (+), the programs are, are looking for, and found:
# negative strand, negative direction, looking for GCCGCC(A/G)CCATGG, 0.
# positive strand, negative direction, looking for GCCGCC(A/G)CCATGG, 0.
# positive strand, positive direction, looking for GCCGCC(A/G)CCATGG, 0.
# negative strand, positive direction, looking for GCCGCC(A/G)CCATGG, 0.
# complement, negative strand, negative direction, looking for CGGCGG(C/T)GGTACC, 0.
# complement, positive strand, negative direction, looking for CGGCGG(C/T)GGTACC, 0.
# complement, positive strand, positive direction, looking for CGGCGG(C/T)GGTACC, 0.
# complement, negative strand, positive direction, looking for CGGCGG(C/T)GGTACC, 0.
# inverse complement, negative strand, negative direction, looking for CCATGG(C/T)GGCGGC, 0.
# inverse complement, positive strand, negative direction, looking for CCATGG(C/T)GGCGGC, 0.
# inverse complement, positive strand, positive direction, looking for CCATGG(C/T)GGCGGC, 0.
# inverse complement, negative strand, positive direction, looking for CCATGG(C/T)GGCGGC, 0.
# inverse positive strand, negative direction, looking for GGTACC(A/G)CCGCCG, 0.
# inverse negative strand, negative direction, looking for GGTACC(A/G)CCGCCG, 0.
# inverse positive strand, positive direction, looking for GGTACC(A/G)CCGCCG, 0.
# inverse negative strand, positive direction, looking for GGTACC(A/G)CCGCCG, 0.
 
==(Matsumoto) samplings==
 
Copying an apparent consensus sequence for the Kozak sequence of GAAAATGG and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and none between ZNF497 and A1BG as can be found by the computer programs.
 
For the Basic programs testing consensus sequence GAAAATGG (starting with SuccessablesKozM.bas) written to compare nucleotide sequences with the sequences on either the template strand (-), or coding strand (+), of the DNA, in the negative direction (-), or the positive direction (+), the programs are, are looking for, and found:
# negative strand, negative direction, looking for GAAAATGG, 0.
# positive strand, negative direction, looking for GAAAATGG, 0.
# positive strand, positive direction, looking for GAAAATGG, 0.
# negative strand, positive direction, looking for GAAAATGG, 0.
# complement, negative strand, negative direction, looking for CTTTTACC, 0.
# complement, positive strand, negative direction, looking for CTTTTACC, 0.
# complement, positive strand, positive direction, looking for CTTTTACC, 0.
# complement, negative strand, positive direction, looking for CTTTTACC, 0.
# inverse complement, negative strand, negative direction, looking for CCATTTTC, 0.
# inverse complement, positive strand, negative direction, looking for CCATTTTC, 0.
# inverse complement, positive strand, positive direction, looking for CCATTTTC, 0.
# inverse complement, negative strand, positive direction, looking for CCATTTTC, 0.
# inverse negative strand, negative direction, looking for GGTAAAAG, 0.
# inverse positive strand, negative direction, looking for GGTAAAAG, 0.
# inverse positive strand, positive direction, looking for GGTAAAAG, 0.
# inverse negative strand, positive direction, looking for GGTAAAAG, 0.
 
==Acknowledgements==
 
The content on this page was first contributed by: Henry A. Hoff.


==See also==
==See also==
{{div col|colwidth=20em}}
{{div col|colwidth=20em}}
* [[A1BG gene transcription core promoters]]
* [[A1BG gene transcriptions]]
* [[A1BG regulatory elements and regions]]
* [[A1BG response element negative results]]
* [[A1BG response element positive results]]
* [[Complex locus A1BG and ZNF497]]
* [[Complex locus A1BG and ZNF497]]
* [[Transcription factor]]
{{Div col end}}
{{Div col end}}


Line 115: Line 163:


==External links==
==External links==
* [http://www.genome.jp/ GenomeNet KEGG database]
* [http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene Home - Gene - NCBI]
* [http://www.ncbi.nlm.nih.gov/sites/gquery NCBI All Databases Search]
* [http://www.ncbi.nlm.nih.gov/ncbisearch/ NCBI Site Search]
* [http://www.ncbi.nlm.nih.gov/pccompound PubChem Public Chemical Database]


<!-- footer templates -->
<!-- footer templates -->
Line 120: Line 173:


<!-- footer categories -->
<!-- footer categories -->
[[Category:Resources last modified in September 2020]]

Latest revision as of 01:44, 21 April 2022

Associate Editor(s)-in-Chief: Henry A. Hoff

The Kozak sequence is a nucleic acid motif that functions as the protein translation initiation site in most eukaryotic mRNA transcripts.[1] Regarded as the optimum sequence for initiating translation in eukaryotes, it is integral to protein regulation and overall cellular health and has implications for human disease.[1][2]

A wrong start site can result in non-functional proteins.[3]

As the sequence has been studied further, extensions of the nucleotide sequence, bases of particular importance, and notable exceptions have been identified.[1][4][5]

The sequence was discovered through a detailed analysis of genomic DNA sequences.[6]

The Kozak sequence was determined by sequencing of 699 vertebrate mRNAs and verified by site-directed mutagenesis.[7] Although the analysis was initially limited to a subset of vertebrates (i.e., human, cow, cat, dog, chicken, guinea pig, hamster, mouse, pig, rabbit, sheep, and Xenopus), subsequent studies confirmed its conservation in higher eukaryotes generally.[1] The sequence was defined as 5'-(gcc)gccRccATGG-3' in IUPAC nucleobase notation, where R denotes a purine (A or G).[7]
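The IUPAC form of the consensus can be turned into a search pattern mechanically. The following Python sketch is illustrative only and is not one of the programs used on this page; the names `consensus_to_regex` and `find_kozak` are hypothetical. It assumes that the parenthesised bases are optional and that upper and lower case can be treated alike when searching.

```python
import re

# IUPAC nucleotide codes used in this sketch; R is a purine, Y a pyrimidine.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus such as '(gcc)gccRccATGG' into a regular expression.

    Assumption of this sketch: a parenthesised run of bases is optional.
    """
    parts = []
    for ch in consensus:
        if ch == "(":
            parts.append("(?:")      # open a non-capturing group
        elif ch == ")":
            parts.append(")?")       # close it and mark it optional
        else:
            parts.append(IUPAC[ch.upper()])
    return "".join(parts)


KOZAK_REGEX = consensus_to_regex("(gcc)gccRccATGG")


def find_kozak(seq: str) -> list[int]:
    """Return the 0-based start of each non-overlapping match in seq."""
    return [m.start() for m in re.finditer(KOZAK_REGEX, seq.upper())]
```

For example, `find_kozak("TTGCCGCCACCATGGCG")` reports a match starting at index 2, where the optional GCC is present, and a sequence with no such context returns an empty list.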

Human genes

Consensus sequences

One reported Kozak consensus sequence is GAAAATGG.[8]

The consensus for the Kozak sequence derived from the analysis of vertebrate mRNAs is 5'-(GCC)GCC(A/G)CCATGG-3'.[7]

(Kozak) samplings

Copying an apparent consensus sequence for the Kozak sequence, (GCC)GCC(A/G)CCATGG or GCCACCAT, into a "⌘F" (find) search finds no occurrences between ZSCAN22 and A1BG and none between ZNF497 and A1BG, the same result the computer programs give.

The Basic programs testing the consensus sequence GCCGCC(A/G)CCATGG (starting with SuccessablesKoz.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The strand and direction each program examines, the sequence it looks for, and the number of occurrences found are:

  1. negative strand, negative direction, looking for GCCGCC(A/G)CCATGG, 0.
  2. positive strand, negative direction, looking for GCCGCC(A/G)CCATGG, 0.
  3. positive strand, positive direction, looking for GCCGCC(A/G)CCATGG, 0.
  4. negative strand, positive direction, looking for GCCGCC(A/G)CCATGG, 0.
  5. complement, negative strand, negative direction, looking for CGGCGG(C/T)GGTACC, 0.
  6. complement, positive strand, negative direction, looking for CGGCGG(C/T)GGTACC, 0.
  7. complement, positive strand, positive direction, looking for CGGCGG(C/T)GGTACC, 0.
  8. complement, negative strand, positive direction, looking for CGGCGG(C/T)GGTACC, 0.
  9. inverse complement, negative strand, negative direction, looking for CCATGG(C/T)GGCGGC, 0.
  10. inverse complement, positive strand, negative direction, looking for CCATGG(C/T)GGCGGC, 0.
  11. inverse complement, positive strand, positive direction, looking for CCATGG(C/T)GGCGGC, 0.
  12. inverse complement, negative strand, positive direction, looking for CCATGG(C/T)GGCGGC, 0.
  13. inverse positive strand, negative direction, looking for GGTACC(A/G)CCGCCG, 0.
  14. inverse negative strand, negative direction, looking for GGTACC(A/G)CCGCCG, 0.
  15. inverse positive strand, positive direction, looking for GGTACC(A/G)CCGCCG, 0.
  16. inverse negative strand, positive direction, looking for GGTACC(A/G)CCGCCG, 0.
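The four search patterns in the list above (direct, complement, inverse complement, and inverse) follow from the consensus by complementing and reversing it position by position. The Python sketch below illustrates that derivation; it is not the Basic code itself, and the helper names `tokenize`, `render`, and `variants` are hypothetical. Alternatives at a degenerate position are written in alphabetical order, which is an assumption made to match the notation used above.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def tokenize(pattern: str) -> list[tuple[str, ...]]:
    """Split 'GCCGCC(A/G)CCATGG' into positions, each a tuple of allowed bases."""
    tokens, i = [], 0
    while i < len(pattern):
        if pattern[i] == "(":
            end = pattern.index(")", i)
            tokens.append(tuple(sorted(pattern[i + 1:end].split("/"))))
            i = end + 1
        else:
            tokens.append((pattern[i],))
            i += 1
    return tokens


def render(tokens: list[tuple[str, ...]]) -> str:
    """Write positions back out, using (X/Y) for a degenerate position."""
    return "".join(
        t[0] if len(t) == 1 else "(" + "/".join(t) + ")" for t in tokens
    )


def variants(pattern: str) -> dict[str, str]:
    """Return the direct, complement, inverse complement and inverse patterns."""
    tokens = tokenize(pattern)
    comp = [tuple(sorted(COMPLEMENT[b] for b in t)) for t in tokens]
    return {
        "direct": render(tokens),
        "complement": render(comp),
        "inverse complement": render(comp[::-1]),
        "inverse": render(tokens[::-1]),
    }
```

Calling `variants("GCCGCC(A/G)CCATGG")` reproduces the four patterns listed above, and the same function applies unchanged to a consensus with no degenerate positions.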

(Matsumoto) samplings

Copying an apparent consensus sequence for the Kozak sequence, GAAAATGG, into a "⌘F" (find) search finds no occurrences between ZSCAN22 and A1BG and none between ZNF497 and A1BG, the same result the computer programs give.

The Basic programs testing the consensus sequence GAAAATGG (starting with SuccessablesKozM.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The strand and direction each program examines, the sequence it looks for, and the number of occurrences found are:

  1. negative strand, negative direction, looking for GAAAATGG, 0.
  2. positive strand, negative direction, looking for GAAAATGG, 0.
  3. positive strand, positive direction, looking for GAAAATGG, 0.
  4. negative strand, positive direction, looking for GAAAATGG, 0.
  5. complement, negative strand, negative direction, looking for CTTTTACC, 0.
  6. complement, positive strand, negative direction, looking for CTTTTACC, 0.
  7. complement, positive strand, positive direction, looking for CTTTTACC, 0.
  8. complement, negative strand, positive direction, looking for CTTTTACC, 0.
  9. inverse complement, negative strand, negative direction, looking for CCATTTTC, 0.
  10. inverse complement, positive strand, negative direction, looking for CCATTTTC, 0.
  11. inverse complement, positive strand, positive direction, looking for CCATTTTC, 0.
  12. inverse complement, negative strand, positive direction, looking for CCATTTTC, 0.
  13. inverse negative strand, negative direction, looking for GGTAAAAG, 0.
  14. inverse positive strand, negative direction, looking for GGTAAAAG, 0.
  15. inverse positive strand, positive direction, looking for GGTAAAAG, 0.
  16. inverse negative strand, positive direction, looking for GGTAAAAG, 0.
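A tally like the one above can be approximated by counting each of the four patterns in a sequence string. The Python sketch below is a minimal illustration under stated assumptions: it does not reproduce how the Basic programs handle strands and directions, the function names `count_overlapping` and `tally` are hypothetical, and the sequences in the usage note are toy strings, not the regions between ZSCAN22, ZNF497, and A1BG.

```python
# The four GAAAATGG-derived patterns from the list above.
MATSUMOTO_VARIANTS = {
    "direct": "GAAAATGG",
    "complement": "CTTTTACC",
    "inverse complement": "CCATTTTC",
    "inverse": "GGTAAAAG",
}


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlapping matches."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        # Resume one base further on so overlapping occurrences are counted.
        start = seq.find(motif, start + 1)
    return count


def tally(seq: str) -> dict[str, int]:
    """Return the number of occurrences of each pattern in seq."""
    seq = seq.upper()
    return {
        name: count_overlapping(seq, motif)
        for name, motif in MATSUMOTO_VARIANTS.items()
    }
```

On the toy string `"TTGAAAATGGTT"`, `tally` counts one direct occurrence and zero for the other three patterns; a result of zero for every pattern corresponds to the all-zero list above.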

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

See also

References

  1. Kozak, Marilyn (February 1989). "The scanning model for translation: an update". The Journal of Cell Biology. 108 (2): 229–241. doi:10.1083/jcb.108.2.229. ISSN 0021-9525. PMID 2645293.
  2. Kozak, Marilyn (2002-10-16). "Pushing the limits of the scanning mechanism for initiation of translation". Gene. 299 (1): 1–34. doi:10.1016/S0378-1119(02)01056-9. ISSN 0378-1119. PMID 12459250.
  3. Kozak, Marilyn (1999-07-08). "Initiation of translation in prokaryotes and eukaryotes". Gene. 234 (2): 187–208. doi:10.1016/S0378-1119(99)00210-3. ISSN 0378-1119. PMID 10395892.
  4. De Angioletti M, Lacerra G, Sabato V, Carestia C (2004). "Beta+45 G --> C: a novel silent beta-thalassaemia mutation, the first in the Kozak sequence". British Journal of Haematology. 124 (2): 224–31. doi:10.1046/j.1365-2141.2003.04754.x. PMID 14687034.
  5. Hernández, Greco; Osnaya, Vincent G.; Pérez-Martínez, Xochitl (2019-07-25). "Conservation and Variability of the AUG Initiation Codon Context in Eukaryotes". Trends in Biochemical Sciences. 44 (12): 1009–1021. doi:10.1016/j.tibs.2019.07.001. ISSN 0968-0004. PMID 31353284.
  6. Kozak, Marilyn (1984-01-25). "Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs". Nucleic Acids Research. 12 (2): 857–872. doi:10.1093/nar/12.2.857. ISSN 0305-1048. PMID 6694911.
  7. Kozak, Marilyn (October 1987). "An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs". Nucleic Acids Research. 15 (20): 8125–8148. doi:10.1093/nar/15.20.8125. PMID 3313277.
  8. Takuya Matsumoto, Saemi Kitajima, Chisato Yamamoto, Mitsuru Aoyagi, Yoshiharu Mitoma, Hiroyuki Harada and Yuji Nagashima (9 August 2020). "Cloning and tissue distribution of the ATP-binding cassette subfamily G member 2 gene in the marine pufferfish Takifugu rubripes" (PDF). Fisheries Science. 86: 873–887. doi:10.1007/s12562-020-01451-z. Retrieved 27 September 2020.

External links