AGC box gene transcriptions: Difference between revisions
(30 intermediate revisions by the same user not shown) | |||
Line 220: | Line 220: | ||
"An AGC box (AGCCGCC) was found [from peach (''Prunus persica'' L. Batsch cv. Loring)] between 886 and 892 bp upstream of the translation start site which has been shown in other ethylene-responsive PR genes to be a binding site for ethylene-responsive binding factor proteins (ERF proteins) (Ohme-Takagi and Shinshi, 1995; Sato et al., 1996; Jia and Martin, 1999; Fujimoto et al., 2000)."<ref name=Moon/> | "An AGC box (AGCCGCC) was found [from peach (''Prunus persica'' L. Batsch cv. Loring)] between 886 and 892 bp upstream of the translation start site which has been shown in other ethylene-responsive PR genes to be a binding site for ethylene-responsive binding factor proteins (ERF proteins) (Ohme-Takagi and Shinshi, 1995; Sato et al., 1996; Jia and Martin, 1999; Fujimoto et al., 2000)."<ref name=Moon/> | ||
"The peach ACO1 does have an AGC box that has been found to bind ethylene responsive elements in response to pathogen infections (Ohme-Takagi et al., 2000; Rushton et al., 2002). Only the apple ACO1 also contains this sequence. In addition, both PpACO1 and the apple ACO1 have a MADS box transcription factor binding site (CarG) (Tilly et al., 1998), but none of the other ACO genes do. "<ref name=Moon/> | "The peach ACO1 does have an AGC box that has been found to bind ethylene responsive elements in response to pathogen infections (Ohme-Takagi et al., 2000; Rushton et al., 2002). Only the apple ACO1 also contains this sequence. In addition, both PpACO1 and the apple ACO1 have a MADS box transcription factor binding site (CarG) (Tilly et al., 1998), but none of the other ACO genes do."<ref name=Moon/> | ||
==E2F4== | ==E2F4== | ||
[[Image:Protein E2F4 PDB 1cf7.png|right|thumb|250px|Structure of the E2F4 protein shown is based on PyMOL rendering of PDB 1cf7. Credit: [[commons:User:Emw|Emw]].]] | [[Image:Protein E2F4 PDB 1cf7.png|right|thumb|250px|Structure of the E2F4 protein shown is based on PyMOL rendering of PDB 1cf7. Credit: [[commons:User:Emw|Emw]].]] | ||
Gene ID: 1874 - "The protein encoded by this gene is a member of the E2F family of transcription factors. The E2F family plays a crucial role in the control of cell cycle and action of tumor suppressor proteins and is also a target of the transforming proteins of small DNA tumor viruses. The E2F proteins contain several evolutionally conserved domains found in most members of the family. These domains include a DNA binding domain, a dimerization domain which determines interaction with the differentiation regulated transcription factor proteins (DP), a transactivation domain enriched in acidic amino acids, and a tumor suppressor protein association domain which is embedded within the transactivation domain. This protein binds to all three of the tumor suppressor proteins pRB, p107 and p130, but with higher affinity to the last two. It plays an important role in the suppression of proliferation-associated genes, and its gene mutation and increased expression may be associated with human cancer."<ref name= | Gene ID: 1874 - "The protein encoded by this gene is a member of the E2F family of transcription factors. The E2F family plays a crucial role in the control of cell cycle and action of tumor suppressor proteins and is also a target of the transforming proteins of small DNA tumor viruses. The E2F proteins contain several evolutionally conserved domains found in most members of the family. These domains include a DNA binding domain, a dimerization domain which determines interaction with the differentiation regulated transcription factor proteins (DP), a transactivation domain enriched in acidic amino acids, and a tumor suppressor protein association domain which is embedded within the transactivation domain. This protein binds to all three of the tumor suppressor proteins pRB, p107 and p130, but with higher affinity to the last two. 
It plays an important role in the suppression of proliferation-associated genes, and its gene mutation and increased expression may be associated with human cancer."<ref name=RefSeq1874>{{ cite book | ||
|author=RefSeqJuly2008 | |author=RefSeqJuly2008 | ||
|title=E2F4 E2F transcription factor 4 [ Homo sapiens (human) ] | |title=E2F4 E2F transcription factor 4 [ Homo sapiens (human) ] | ||
Line 256: | Line 256: | ||
For the Basic programs (starting with SuccessablesAGC.bas) written to compare nucleotide sequences with the sequences on either the template strand (-), or coding strand (+), of the DNA, in the negative direction (-), or the positive direction (+), including extending the number of nts from 958 to 4445, the programs, the sequences they look for, and the occurrences found are: | For the Basic programs (starting with SuccessablesAGC.bas) written to compare nucleotide sequences with the sequences on either the template strand (-), or coding strand (+), of the DNA, in the negative direction (-), or the positive direction (+), including extending the number of nts from 958 to 4445, the programs, the sequences they look for, and the occurrences found are: | ||
# negative strand in the negative direction is SuccessablesAGC--.bas, looking for | # negative strand in the negative direction is SuccessablesAGC--.bas, looking for AGCCGCC: 0. | ||
# negative strand in the positive direction is SuccessablesAGC-+.bas, looking for | # negative strand in the positive direction is SuccessablesAGC-+.bas, looking for AGCCGCC: 0. | ||
# positive strand in the negative direction is SuccessablesAGC+-.bas, looking for | # positive strand in the negative direction is SuccessablesAGC+-.bas, looking for AGCCGCC: 0. | ||
# positive strand in the positive direction is SuccessablesAGC++.bas, looking for | # positive strand in the positive direction is SuccessablesAGC++.bas, looking for AGCCGCC: 0. | ||
# complement, negative strand, negative direction is | # inverse complement, negative strand, negative direction is SuccessablesAGCci--.bas, looking for GGCGGCT: 0. | ||
# complement, negative strand, positive direction is | # inverse complement, negative strand, positive direction is SuccessablesAGCci-+.bas, looking for GGCGGCT: 0. | ||
# complement, positive strand, negative direction is | # inverse complement, positive strand, negative direction is SuccessablesAGCci+-.bas, looking for GGCGGCT: 1, GGCGGCT at 1754. | ||
# complement, positive strand, | # inverse complement, positive strand, positive direction is SuccessablesAGCci++.bas, looking for GGCGGCT: 0. | ||
# | |||
# | ===AGCbox negative direction (2596-1) distal promoters=== | ||
# | |||
# Positive strand, negative direction: GGCGGCT at 1754. | |||
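The eight searches listed above all reduce to scanning one strand for AGCCGCC and for its inverse complement GGCGGCT. A minimal Python sketch of that scan follows. It is not the SuccessablesAGC*.bas code, which is not reproduced here; the function names, the 1-based position convention, and the toy sequence are assumptions made for illustration.

```python
# Hypothetical helper names; not taken from the SuccessablesAGC*.bas programs.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def inverse_complement(motif):
    """Return the inverse (reverse) complement of a DNA motif."""
    return motif.translate(COMPLEMENT)[::-1]


def find_motif(sequence, motif):
    """Return 1-based start positions of every match, overlapping ones included."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based position
        start = sequence.find(motif, start + 1)
    return positions


# Toy sequence (an assumption, not real A1BG promoter data).
toy = "TTAGCCGCCAAGGCGGCTTT"
agc = "AGCCGCC"
agc_ci = inverse_complement(agc)

print(agc, find_motif(toy, agc))
print(agc_ci, find_motif(toy, agc_ci))
```

Running the same two calls on each strand, read in each direction, would give the eight counts reported in the list; how the original programs number positions is not stated, so the 1-based convention here may differ from theirs.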
==AGC random dataset samplings== | |||
# AGCr0: 1, AGCCGCC at 2380. | |||
# AGCr1: 0. | |||
# AGCr2: 0. | |||
# AGCr3: 2, AGCCGCC at 4138, AGCCGCC at 1452. | |||
# AGCr4: 1, AGCCGCC at 80. | |||
# AGCr5: 1, AGCCGCC at 4353. | |||
# AGCr6: 0. | |||
# AGCr7: 0. | |||
# AGCr8: 0. | |||
# AGCr9: 1, AGCCGCC at 2449. | |||
# AGCr0ci: 1, GGCGGCT at 3548. | |||
# AGCr1ci: 0. | |||
# AGCr2ci: 1, GGCGGCT at 4349. | |||
# AGCr3ci: 1, GGCGGCT at 1443. | |||
# AGCr4ci: 1, GGCGGCT at 4110. | |||
# AGCr5ci: 0. | |||
# AGCr6ci: 0. | |||
# AGCr7ci: 0. | |||
# AGCr8ci: 0. | |||
# AGCr9ci: 0. | |||
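The random-dataset counts above can be compared with a simple expectation. The sketch below assumes that each random dataset is a uniformly random sequence of A, C, G and T with a length of 4560 nts (taken from the upper bound of the UTR range); neither assumption is stated in the text, and the function names are hypothetical. Under those assumptions a fixed 7-nt motif is expected about 0.28 times per strand.

```python
import random


def expected_hits(length, motif_length):
    """Expected matches of one fixed motif on one strand of a uniform random sequence."""
    return (length - motif_length + 1) / 4 ** motif_length


def random_sequence(length, rng):
    """Build a uniformly random nucleotide sequence (an assumed model of the AGCr datasets)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def count_hits(sequence, motif):
    """Count matches of motif in sequence, overlapping ones included."""
    last_start = len(sequence) - len(motif)
    return sum(1 for i in range(last_start + 1) if sequence.startswith(motif, i))


# Counts copied from the AGCr0..AGCr9 and AGCr0ci..AGCr9ci lines above.
agc_counts = [1, 0, 0, 2, 1, 1, 0, 0, 0, 1]
agc_ci_counts = [1, 0, 1, 1, 1, 0, 0, 0, 0, 0]
observed_agc = sum(agc_counts) / len(agc_counts)
observed_ci = sum(agc_ci_counts) / len(agc_ci_counts)

# Ten simulated datasets under the assumed uniform model, for comparison.
rng = random.Random(0)
simulated = [count_hits(random_sequence(4560, rng), "AGCCGCC") for _ in range(10)]

print("expected per strand:", round(expected_hits(4560, 7), 3))
print("observed AGCCGCC:", observed_agc, "observed GGCGGCT:", observed_ci)
print("simulated counts:", simulated)
```

If the actual random datasets were built with a different base composition or length, the expectation changes accordingly.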
===AGCr arbitrary (evens) (4560-2846) UTRs=== | |||
# AGCr0ci: GGCGGCT at 3548. | |||
# AGCr2ci: GGCGGCT at 4349. | |||
# AGCr4ci: GGCGGCT at 4110. | |||
===AGCr alternate (odds) (4560-2846) UTRs=== | |||
# AGCr3: AGCCGCC at 4138. | |||
# AGCr5: AGCCGCC at 4353. | |||
===AGCr arbitrary positive direction (odds) (4445-4265) core promoters=== | |||
# AGCr5: AGCCGCC at 4353. | |||
===AGCr alternate positive direction (evens) (4445-4265) core promoters=== | |||
# AGCr2ci: GGCGGCT at 4349. | |||
===AGCr arbitrary positive direction (odds) (4265-4050) proximal promoters=== | |||
# AGCr3: AGCCGCC at 4138. | |||
===AGCr alternate positive direction (evens) (4265-4050) proximal promoters=== | |||
# AGCr4ci: GGCGGCT at 4110. | |||
===AGCr arbitrary negative direction (evens) (2596-1) distal promoters=== | |||
# AGCr0: AGCCGCC at 2380. | |||
# AGCr4: AGCCGCC at 80. | |||
===AGCr alternate negative direction (odds) (2596-1) distal promoters=== | |||
# AGCr3: AGCCGCC at 1452. | |||
# AGCr9: AGCCGCC at 2449. | |||
# AGCr3ci: GGCGGCT at 1443. | |||
===AGCr arbitrary positive direction (odds) (4050-1) distal promoters=== | |||
# AGCr3: AGCCGCC at 1452. | |||
# AGCr9: AGCCGCC at 2449. | |||
# AGCr3ci: GGCGGCT at 1443. | |||
===AGCr alternate positive direction (evens) (4050-1) distal promoters=== | |||
# AGCr0: AGCCGCC at 2380. | |||
# AGCr4: AGCCGCC at 80. | |||
# AGCr0ci: GGCGGCT at 3548. | |||
==AGC box analysis and results== | |||
{{main|Complex locus A1BG and ZNF497#AGC boxes}} | |||
"An AGC box (AGCCGCC) was found [from peach (''Prunus persica'' L. Batsch cv. Loring)] between 886 and 892 bp upstream of the translation start site which has been shown in other ethylene-responsive PR genes to be a binding site for ethylene-responsive binding factor proteins (ERF proteins) (Ohme-Takagi and Shinshi, 1995; Sato et al., 1996; Jia and Martin, 1999; Fujimoto et al., 2000)."<ref name=Moon/> | |||
{|class="wikitable" | |||
|- | |||
! Reals or randoms !! Promoters !! Direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1) | |||
|- | |||
| Reals || UTR || negative || 0 || 2 || 0 || 0 | |||
|- | |||
| Randoms || UTR || arbitrary negative || 3 || 10 || 0.3 || 0.25 | |||
|- | |||
| Randoms || UTR || alternate negative || 2 || 10 || 0.2 || 0.25 | |||
|- | |||
| Reals || Core || negative || 0 || 2 || 0 || 0 | |||
|- | |||
| Randoms || Core || arbitrary negative || 0 || 10 || 0 || 0 | |||
|- | |||
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0 | |||
|- | |||
| Reals || Core || positive || 0 || 2 || 0 || 0 | |||
|- | |||
| Randoms || Core || arbitrary positive || 1 || 10 || 0.1 || 0.1 | |||
|- | |||
| Randoms || Core || alternate positive || 1 || 10 || 0.1 || 0.1 | |||
|- | |||
| Reals || Proximal || negative || 0 || 2 || 0 || 0 | |||
|- | |||
| Randoms || Proximal || arbitrary negative || 0 || 10 || 0 || 0 | |||
|- | |||
| Randoms || Proximal || alternate negative || 0 || 10 || 0 || 0 | |||
|- | |||
| Reals || Proximal || positive || 0 || 2 || 0 || 0 | |||
|- | |||
| Randoms || Proximal || arbitrary positive || 1 || 10 || 0.1 || 0.1 | |||
|- | |||
| Randoms || Proximal || alternate positive || 1 || 10 || 0.1 || 0.1 | |||
|- | |||
| Reals || Distal || negative || 1 || 2 || 0.5 || 0.5 | |||
|- | |||
| Randoms || Distal || arbitrary negative || 2 || 10 || 0.2 || 0.25 | |||
|- | |||
| Randoms || Distal || alternate negative || 3 || 10 || 0.3 || 0.25 | |||
|- | |||
| Reals || Distal || positive || 0 || 2 || 0 || 0 | |||
|- | |||
| Randoms || Distal || arbitrary positive || 3 || 10 || 0.3 || 0.3 | |||
|- | |||
| Randoms || Distal || alternate positive || 3 || 10 || 0.3 || 0.3 | |||
|} | |||
Comparison: | |||
The one real AGC box, in the negative-direction distal promoter (0.5 per strand), occurs more often than in the randoms (0.25). This suggests that the real AGC box is likely active or activable. | |||
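The arithmetic behind the table columns can be sketched as follows. The table does not define its columns, so this reading is an assumption inferred from the rows: Occurrences appears to be Numbers divided by Strands, and the Averages value for the randoms appears to be the mean of the arbitrary and alternate occurrences. The function names are hypothetical.

```python
def occurrences(numbers, strands):
    """Occurrences per strand: hits found divided by strands searched."""
    return numbers / strands


def paired_average(arbitrary, alternate):
    """Mean of the arbitrary and alternate random occurrences."""
    return (arbitrary + alternate) / 2


# Distal promoters, negative direction, values copied from the table above.
real_distal_negative = occurrences(1, 2)
random_distal_negative = paired_average(occurrences(2, 10), occurrences(3, 10))

print("real:", real_distal_negative)
print("randoms:", round(random_distal_negative, 2))
print("real exceeds randoms:", real_distal_negative > random_distal_negative)
```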
==GCC box samplings== | ==GCC box samplings== | ||
Copying | Copying GCCGCC in "⌘F" yields one between ZSCAN22 and A1BG and two between ZNF497 and A1BG as can be found by the computer programs. | ||
For the Basic programs (starting with SuccessablesGCC.bas) written to compare nucleotide sequences with the sequences on either the template strand (-), or coding strand (+), of the DNA, in the negative direction (-), or the positive direction (+), including extending the number of nts from 958 to 4445, the programs are, are looking for, and found: | For the Basic programs (starting with SuccessablesGCC.bas) written to compare nucleotide sequences with the sequences on either the template strand (-), or coding strand (+), of the DNA, in the negative direction (-), or the positive direction (+), including extending the number of nts from 958 to 4445, the programs are, are looking for, and found: | ||
# negative strand | # negative strand, negative direction, looking for GCCGCC, 1, GCCGCC at 2727. | ||
# positive strand | # positive strand, negative direction, looking for GCCGCC, 0. | ||
# | # negative strand, positive direction, looking for GCCGCC, 2, GCCGCC at 1757, GCCGCC at 904. | ||
# positive strand, positive direction, looking for GCCGCC, 1, GCCGCC at 356. | |||
# | |||
# inverse complement, negative strand, negative direction, looking for GGCGGC, 0. | # inverse complement, negative strand, negative direction, looking for GGCGGC, 0. | ||
# inverse complement, positive strand, negative direction, looking for GGCGGC, 1, GGCGGC at 1753. | # inverse complement, positive strand, negative direction, looking for GGCGGC, 1, GGCGGC at 1753. | ||
# inverse complement, positive strand, positive direction, looking for GGCGGC, 0. | # inverse complement, positive strand, positive direction, looking for GGCGGC, 0. | ||
# inverse complement, negative strand, positive direction, looking for GGCGGC, 3, GGCGGC at 1902, GGCGGC at 1794, GGCGGC at 354. | # inverse complement, negative strand, positive direction, looking for GGCGGC, 3, GGCGGC at 1902, GGCGGC at 1794, GGCGGC at 354. | ||
=== | ===GCC negative direction (2811-2596) proximal promoters=== | ||
# Negative strand, negative direction: GCCGCC at 2727. | |||
===GCC negative direction (2596-1) distal promoters=== | |||
# Positive strand, negative direction: GGCGGC at 1753. | |||
Positive strand, positive direction: GCCGCC at 356. | ===GCC positive direction (4050-1) distal promoters=== | ||
# Negative strand, positive direction: GCCGCC at 1757, GCCGCC at 904. | |||
# Negative strand, positive direction: GGCGGC at 1902, GGCGGC at 1794, GGCGGC at 354. | |||
# Positive strand, positive direction: GCCGCC at 356. | |||
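Sorting each hit into a promoter region by its position can be sketched as below. The ranges are copied from the subsection headings in this page; the dictionary layout and the function name are hypothetical. Two further assumptions are labelled in the code: ranges are treated as inclusive, and a position on a shared boundary (2596, 4050 or 4265) is assigned to the first region listed. Ranges not given in these headings (for example a negative-direction core promoter) are left out rather than guessed.

```python
# Ranges copied from the subsection headings; layout and names are hypothetical.
REGIONS = {
    "negative": [("UTR", 2846, 4560), ("proximal", 2596, 2811), ("distal", 1, 2596)],
    "positive": [("core", 4265, 4445), ("proximal", 4050, 4265), ("distal", 1, 4050)],
}


def classify(position, direction):
    """Return the first region whose inclusive range contains position, else None."""
    for name, low, high in REGIONS[direction]:
        if low <= position <= high:  # inclusive bounds are an assumption
            return name
    return None


# Real GCC box hits from the lists above.
hits = [(2727, "negative"), (1753, "negative"), (1757, "positive"), (356, "positive")]
for position, direction in hits:
    print(position, direction, classify(position, direction))
```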
==GCC random dataset samplings== | ==GCC random dataset samplings== | ||
Line 320: | Line 438: | ||
# GCCr8: 2, GCCGCC at 2518, GCCGCC at 2473. | # GCCr8: 2, GCCGCC at 2518, GCCGCC at 2473. | ||
# GCCr9: 3, GCCGCC at 2666, GCCGCC at 2449, GCCGCC at 1415. | # GCCr9: 3, GCCGCC at 2666, GCCGCC at 2449, GCCGCC at 1415. | ||
# | # GCCr0ci: 1, GGCGGC at 3547. | ||
# | # GCCr1ci: 0. | ||
# | # GCCr2ci: 1, GGCGGC at 4348. | ||
# | # GCCr3ci: 1, GGCGGC at 1442. | ||
# | # GCCr4ci: 1, GGCGGC at 4109. | ||
# | # GCCr5ci: 2, GGCGGC at 2932, GGCGGC at 678. | ||
# | # GCCr6ci: 1, GGCGGC at 4434. | ||
# | # GCCr7ci: 0. | ||
# | # GCCr8ci: 1, GGCGGC at 4280. | ||
# | # GCCr9ci: 3, GGCGGC at 3896, GGCGGC at 3628, GGCGGC at 1727. | ||
===GCCr arbitrary (evens) (4560-2846) UTRs=== | |||
# GCCr0: GCCGCC at 3407. | # GCCr0: GCCGCC at 3407. | ||
# GCCr2: GCCGCC at 3586. | # GCCr2: GCCGCC at 3586. | ||
# GCCr0ci: GGCGGC at 3547. | |||
# GCCr2ci: GGCGGC at 4348. | |||
# GCCr4ci: GGCGGC at 4109. | |||
# GCCr6ci: GGCGGC at 4434. | |||
# GCCr8ci: GGCGGC at 4280. | |||
===GCCr | ===GCCr alternate (odds) (4560-2846) UTRs=== | ||
# GCCr3: GCCGCC at 4138. | |||
# GCCr5: GCCGCC at 4353. | # GCCr5: GCCGCC at 4353. | ||
# GCCr5ci: GGCGGC at 2932. | |||
# GCCr9ci: GGCGGC at 3896, GGCGGC at 3628. | |||
===GCCr arbitrary positive direction (odds) (4445-4265) core promoters=== | |||
# GCCr5: GCCGCC at 4353. | |||
===GCCr alternate positive direction (evens) (4445-4265) core promoters=== | |||
# GCCr2ci: GGCGGC at 4348. | |||
# GCCr6ci: GGCGGC at 4434. | |||
# GCCr8ci: GGCGGC at 4280. | |||
===GCCr arbitrary negative direction (evens) (2811-2596) proximal promoters=== | |||
# GCCr2: GCCGCC at 2598. | # GCCr2: GCCGCC at 2598. | ||
===GCCr alternate negative direction (odds) (2811-2596) proximal promoters=== | |||
# GCCr3: GCCGCC at 2792. | |||
# GCCr9: GCCGCC at 2666. | |||
===GCCr arbitrary positive direction (odds) (4265-4050) proximal promoters=== | |||
# GCCr3: GCCGCC at 4138. | # GCCr3: GCCGCC at 4138. | ||
===GCCr distal promoters=== | ===GCCr alternate positive direction (evens) (4265-4050) proximal promoters=== | ||
# GCCr4ci: GGCGGC at 4109. | |||
===GCCr arbitrary negative direction (evens) (2596-1) distal promoters=== | |||
# GCCr0: GCCGCC at 2380, GCCGCC at 1384. | # GCCr0: GCCGCC at 2380, GCCGCC at 1384. | ||
# GCCr2: GCCGCC at 1966. | # GCCr2: GCCGCC at 1966. | ||
# GCCr4: GCCGCC at 1092, GCCGCC at 1089, GCCGCC at 1022, GCCGCC at 80. | # GCCr4: GCCGCC at 1092, GCCGCC at 1089, GCCGCC at 1022, GCCGCC at 80. | ||
# GCCr8: | # GCCr8: GCCGCC at 2518, GCCGCC at 2473. | ||
===GCCr alternate negative direction (odds) (2596-1) distal promoters=== | |||
# GCCr3: GCCGCC at 1452. | |||
# GCCr7: GCCGCC at 1770. | |||
# GCCr9: GCCGCC at 2449, GCCGCC at 1415. | |||
# GCCr3ci: GGCGGC at 1442. | |||
# GCCr5ci: GGCGGC at 678. | |||
# GCCr9ci: GGCGGC at 1727. | |||
===GCCr arbitrary positive direction (odds) (4050-1) distal promoters=== | |||
# GCCr3: GCCGCC at 2792, GCCGCC at 1452. | # GCCr3: GCCGCC at 2792, GCCGCC at 1452. | ||
# GCCr7: GCCGCC at 1770. | # GCCr7: GCCGCC at 1770. | ||
# GCCr9: GCCGCC at 2666, GCCGCC at 2449, GCCGCC at 1415. | # GCCr9: GCCGCC at 2666, GCCGCC at 2449, GCCGCC at 1415. | ||
# GCCr3ci: GGCGGC at 1442. | |||
# GCCr5ci: GGCGGC at 2932, GGCGGC at 678. | |||
# GCCr9ci: GGCGGC at 3896, GGCGGC at 3628, GGCGGC at 1727. | |||
===GCCr alternate positive direction (evens) (4050-1) distal promoters=== | |||
# GCCr0: GCCGCC at 3407, GCCGCC at 2380, GCCGCC at 1384. | |||
# GCCr2: GCCGCC at 3586, GCCGCC at 2598, GCCGCC at 1966. | |||
# GCCr4: GCCGCC at 1092, GCCGCC at 1089, GCCGCC at 1022, GCCGCC at 80. | |||
# GCCr8: GCCGCC at 2518, GCCGCC at 2473. | |||
# GCCr0ci: GGCGGC at 3547. | |||
==GCC box analysis and results== | |||
{{main|Complex locus A1BG and ZNF497#GCC boxes}} | |||
"Expression of the osmotin gene is similar to that of the OLP gene. The osmotin gene also has several AGCCGCC sequences; a complete AGCCGCC (from -50 to -44), a slightly modified CGCCGCC (from -144 to -138), and an AGCCGCC sequence in reverse orientation (from -162 to -156)."<ref name=Sato/> | |||
{|class="wikitable" | |||
|- | |||
! Reals or randoms !! Promoters !! Direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1) | |||
|- | |||
| Reals || UTR || negative || 0 || 2 || 0 || 0 | |||
|- | |||
| Randoms || UTR || arbitrary negative || 7 || 10 || 0.7 || 0.6 | |||
|- | |||
| Randoms || UTR || alternate negative || 5 || 10 || 0.5 || 0.6 | |||
|- | |||
| Reals || Core || negative || 0 || 2 || 0 || 0 | |||
|- | |||
| Randoms || Core || arbitrary negative || 0 || 10 || 0 || 0 | |||
|- | |||
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0 | |||
|- | |||
| Reals || Core || positive || 0 || 2 || 0 || 0 | |||
|- | |||
| Randoms || Core || arbitrary positive || 1 || 10 || 0.1 || 0.2 | |||
|- | |||
| Randoms || Core || alternate positive || 3 || 10 || 0.3 || 0.2 | |||
|- | |||
| Reals || Proximal || negative || 1 || 2 || 0.5 || 0.5 ± 0.5 (--1,+-0) | |||
|- | |||
| Randoms || Proximal || arbitrary negative || 1 || 10 || 0.1 || 0.15 | |||
|- | |||
| Randoms || Proximal || alternate negative || 2 || 10 || 0.2 || 0.15 | |||
|- | |||
| Reals || Proximal || positive || 0 || 2 || 0 || 0 | |||
|- | |||
| Randoms || Proximal || arbitrary positive || 1 || 10 || 0.1 || 0.1 | |||
|- | |||
| Randoms || Proximal || alternate positive || 1 || 10 || 0.1 || 0.1 | |||
|- | |||
| Reals || Distal || negative || 1 || 2 || 0.5 || 0.5 ± 0.5 (--0,+-1) | |||
|- | |||
| Randoms || Distal || arbitrary negative || 9 || 10 || 0.9 || 0.8 | |||
|- | |||
| Randoms || Distal || alternate negative || 7 || 10 || 0.7 || 0.8 | |||
|- | |||
| Reals || Distal || positive || 6 || 2 || 3 || 3 ± 2 (-+5,++1) | |||
|- | |||
| Randoms || Distal || arbitrary positive || 12 || 10 || 1.2 || 1.25 | |||
|- | |||
| Randoms || Distal || alternate positive || 13 || 10 || 1.3 || 1.25 | |||
|} | |||
Comparison: | |||
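The occurrences of real GCC boxes in the negative-direction proximal promoters (0.5 versus 0.15) and the positive-direction distal promoters (3 versus 1.25) are greater than the randoms, whereas the negative-direction distal promoters (0.5 versus 0.8) are lower. This suggests that the real GCC boxes in the proximal and positive-direction distal promoters are likely active or activable. | |||
GCC boxes occur in the following: | |||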
# AGC boxes: "The GCC box, also referred to as the '''AGC box''' (10), GCC element (11), or AGCCGCC sequence (13), is an ethylene-responsive element found in the promoters of a large number of [pathogenesis related] PR genes whose expression is up-regulated following pathogen attack."<ref name=Buttner>{{ cite journal | |||
|author=Michael Büttner and Karam B. Singh | |||
|title=''Arabidopsis thaliana'' ethylene-responsive element binding protein (AtEBP), an ethylene-inducible, GCC box DNA-binding protein interacts with an ocs element binding protein | |||
|journal=Proceedings of the National Academy of Sciences of the United States of America | |||
|date=May 27, 1997 | |||
|volume=94 | |||
|issue=11 | |||
|pages=5961-6 | |||
|url=http://www.pnas.org/content/94/11/5961.long | |||
|arxiv= | |||
|bibcode= | |||
|doi= | |||
|pmid= | |||
|accessdate=2014-05-02 }}</ref> | |||
# DNA damage response elements (DRE) (Sumrada, core): "A consensus sequence, 5'-TAGCCGCCGRRRR-3' (where R = an unspecified purine nucleoside [A/G]), was generated from these data."<ref name=Sumrada>{{ cite journal | |||
|author=Roberta A. Sumrada and Terrance G. Cooper | |||
|title=Ubiquitous upstream repression sequences control activation of the inducible arginase gene in yeast | |||
|journal=Proceedings of the National Academy of Sciences USA | |||
|date=June 1987 | |||
|volume=84 | |||
|issue= | |||
|pages=3997-4001 | |||
|url=https://www.ncbi.nlm.nih.gov/pmc/articles/PMC305008/pdf/pnas00277-0054.pdf | |||
|arxiv= | |||
|bibcode= | |||
|doi=10.1073/pnas.84.12.3997 | |||
|pmid=3295874 | |||
|accessdate=6 September 2020 }}</ref> | |||
# GGC triplets: "The transcription factors Uga3, Dal81 and Leu3 belong to the class III family (Zn(II)<sub>2</sub>Cys<sub>6</sub> proteins), and they recognize highly related sequences rich in GGC triplets [15]."<ref name=Ruiz>{{ cite journal | |||
|author=Marcos Palavecino-Ruiz, Mariana Bermudez-Moretti and Susana Correa-Garcia | |||
|title=Unravelling the transcriptional regulation of ''Saccharomyces cerevisiae UGA'' genes: the dual role of transcription factor Leu3 | |||
|journal=Microbiology | |||
|date=12 October 2017 | |||
|volume=163 | |||
|issue= | |||
|pages=1692-1701 | |||
|url=https://www.researchgate.net/profile/Mariana-Bermudez-2/publication/320571623_Unravelling_the_transcriptional_regulation_of_Saccharomyces_cerevisiae_UGA_genes_the_dual_role_of_transcription_factor_Leu3/links/5c62114c299bf1d14cbf7ade/Unravelling-the-transcriptional-regulation-of-Saccharomyces-cerevisiae-UGA-genes-the-dual-role-of-transcription-factor-Leu3.pdf | |||
|arxiv= | |||
|bibcode= | |||
|doi=10.1099/mic.0.000560 | |||
|pmid= | |||
|accessdate=20 April 2021 }}</ref> | |||
# Kozak sequences: GCCGCC(A/G)CCATGG.<ref name=Kozak1987>{{ cite journal | |||
|author=Kozak Marilyn | |||
|date=October 1987 | |||
|title=An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs | |||
|url=http://nar.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=3313277 | |||
|journal=Nucleic Acids Research | |||
|volume=15 | |||
|issue=20 | |||
|pages=8125–8148 | |||
|doi=10.1093/nar/15.20.8125 | |||
|pmid=3313277 }}</ref> | |||
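Two of the consensus sequences in the list above contain a degenerate position: the DRE uses R for a purine (A or G), and the Kozak sequence writes the same choice as (A/G). A minimal sketch of turning such a consensus into a searchable pattern follows. It uses Python's standard `re` module; the function name, the reduced code table, and the toy sequence are assumptions made for illustration.

```python
import re

# Subset of the IUPAC nucleotide codes; only R is needed for the sequences above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}


def consensus_to_regex(consensus):
    """Compile a consensus written with IUPAC codes into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))


# GCCGCC(A/G)CCATGG rewritten with R for the (A/G) position.
KOZAK = consensus_to_regex("GCCGCCRCCATGG")
DRE = consensus_to_regex("TAGCCGCCGRRRR")

# Toy sequence (an assumption, not real data).
toy = "AAGCCGCCACCATGGTT"
match = KOZAK.search(toy)
print(KOZAK.pattern, match.start() + 1 if match else None)
```

Both patterns contain the GCCGCC core, which is why a plain GCCGCC search also picks up these longer elements.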
==Ethylene signaling pathway== | |||
"The GCC box, also referred to as the '''AGC box''' (10), GCC element (11), or AGCCGCC sequence (13), is an ethylene-responsive element found in the promoters of a large number of [pathogenesis related] PR genes whose expression is up-regulated following pathogen attack."<ref name=Buttner/> | |||
In ''Arabidopsis thaliana'' "an ethylene-inducible, GCC box DNA-binding protein interacts with an ocs element binding protein".<ref name=Buttner/> | |||
"Enhancer activity, ethylene responsiveness, and binding of nuclear proteins depend on the integrity of two copies of the AGC box, AGCCGCC, present in the promoters of several ethylene-responsive genes."<ref name=Metzger/> | |||
"cDNA clones have been identified representing 4 novel DNA-binding proteins, called ethylene-responsive element binding proteins (EREBPs), that specifically bind the ERE AGC box".<ref name=Metzger/> | |||
The osmotin-like protein (OLP) "has no intron and ... its promoter region contains two AGCCGCC sequences that are conserved in most basic PR-protein genes."<ref name=Sato/> | |||
The "AGCCGCC sequence(s) is a DNA element(s) responsive to ethylene. An EREBP2 protein, isolated as one of the proteins binding the AGCCGCC sequence of the tobacco rβ-1,3-glucanase gene, also was found to bind to the AGCCGCC sequence(s) of OLP gene. These results suggest that the ethylene-induced expression of OLP is regulated by trans-acting factor(s) common to basic PR-proteins."<ref name=Sato/> | |||
Evidence has been provided "that SAP18 and HDA1 function as transcriptional repressors. [Further] they associate with Ethylene-Responsive Element binding Factors (ERFs) to create a hormone-sensitive multimeric repressor complex under conditions of environmental stress."<ref name=Song/> | |||
"At the molecular level, the actions of ethylene upon gene expression involve Ethylene Responsive element binding Factors (ERFs), which display GCC box-specific binding activities in ''Arabidopsis'' (Ohme-Takagi and Shinshi, 1995). ERFs contain a highly conserved DNA binding domain (the ERF domain) consisting of 58-59 amino acids (Ohme-Takagi and Shinshi, 1995), which binds with high affinity to the GCC box (Hao ''et al.'', 1998)."<ref name=Song/> | |||
"An AGC box (AGCCGCC) was found [from peach (''Prunus persica'' L. Batsch cv. Loring)] between 886 and 892 bp upstream of the translation start site which has been shown in other ethylene-responsive PR genes to be a binding site for ethylene-responsive binding factor proteins (ERF proteins) (Ohme-Takagi and Shinshi, 1995; Sato et al., 1996; Jia and Martin, 1999; Fujimoto et al., 2000)."<ref name=Moon/> | |||
"The peach ACO1 does have an AGC box that has been found to bind ethylene responsive elements in response to pathogen infections (Ohme-Takagi et al., 2000; Rushton et al., 2002). Only the apple ACO1 also contains this sequence. In addition, both PpACO1 and the apple ACO1 have a MADS box transcription factor binding site (CarG) (Tilly et al., 1998), but none of the other ACO genes do."<ref name=Moon/> | |||
==Acknowledgements== | ==Acknowledgements== | ||
Line 391: | Line 690: | ||
==External links== | ==External links== | ||
* [http://www.genome.jp/ GenomeNet KEGG database] | * [http://www.genome.jp/ GenomeNet KEGG database] | ||
* [http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene Home - Gene - NCBI] | * [http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene Home - Gene - NCBI] | ||
* [http://www.ncbi.nlm.nih.gov/sites/gquery NCBI All Databases Search] | * [http://www.ncbi.nlm.nih.gov/sites/gquery NCBI All Databases Search] | ||
* [http://www.ncbi.nlm.nih.gov/ncbisearch/ NCBI Site Search] | * [http://www.ncbi.nlm.nih.gov/ncbisearch/ NCBI Site Search] | ||
* [http://www.osti.gov/ Office of Scientific & Technical Information] | * [http://www.osti.gov/ Office of Scientific & Technical Information] | ||
* [http://www.ncbi.nlm.nih.gov/pccompound PubChem Public Chemical Database] | * [http://www.ncbi.nlm.nih.gov/pccompound PubChem Public Chemical Database] | ||
* [http://www.scirus.com/srsapp/advanced/index.jsp?q1= Scirus for scientific information only advanced search] | * [http://www.scirus.com/srsapp/advanced/index.jsp?q1= Scirus for scientific information only advanced search] | ||
<!-- footer templates --> | <!-- footer templates --> | ||
Line 416: | Line 702: | ||
<!-- categories --> | <!-- categories --> | ||
Latest revision as of 16:35, 29 August 2023
Editor-In-Chief: Henry A. Hoff
"The GCC box, also referred to as the AGC box (10), GCC element (11), or AGCCGCC sequence (13), is an ethylene-responsive element found in the promoters of a large number of [pathogenesis related] PR genes whose expression is up-regulated following pathogen attack."[1]
Consensus sequences
The AGC box has the consensus sequence 3'-AGCCGCC-5' in the direction of transcription.[2]
AGC
"AGC is a binding site for factors responding to pathogen attacks (Ohme-Takagi et al., 2000)".[3]
Inverse copies
For "AGC, one copy in inverse orientation of the AGC box (AGCCGCC) [is] present as two copies (-1346 and -1314) in the ERE".[2]
Enhancers
"Enhancer activity, ethylene responsiveness, and binding of nuclear proteins depend on the integrity of two copies of the AGC box, AGCCGCC, present in the promoters of several ethylene-responsive genes."[2]
"The GLB enhancer contains two copies of the sequence AGCCGCC, which is conserved in several genes showing expression patterns similar to the GLB gene, as well as a sequence identical at 6 of 7 bp."[4]
Glucanase promoters
"One common motif, AGCCGCC (AGC box), has been found to be present in nearly all chitinase and glucanase promoters so far analyzed (Ohme-Takagi and Shinshi 1990; Hart et al. 1993)."[5]
DNA-binding proteins
"cDNA clones have been identified representing 4 novel DNA-binding proteins, called ethylene-responsive element binding proteins (EREBPs), that specifically bind the ERE AGC box".[2]
Functional non-coding DNA
Functional "non-coding DNA is involved in the regulation of gene expression and thus in the evolution of novelties and adaptation between species [...] Functional non-coding sequences fall into two main categories: protein binding sites such as transcription factor binding sites (TFBSs), enhancers [such as the AGC box], and silencers, which are involved in the control of gene expression, and sequences that control chromatin organization such as insulators and matrix attachment regions".[6]
"Genes of PR-1 and -5 proteins have now been identified in the genomes of various species of organisms, including humans and nematodes. PR proteins may contribute to the innate immunity of plants as well as to that of other organisms."[7]
Ostreococcus
"Ocean-dwelling phytoplankton from the genus Ostreococcus emerge at the primitive root of the green plant lineage, dating back nearly 1.5 billion years. Today, these microscopic, free-living creatures, among the smallest eukaryotes ever characterized, barely a micron in diameter, contribute to a significant share of the world’s total photosynthetic activity. These “picophytoplankton” also exhibit great diversity that contrasts sharply with the dearth of ecological niches available to them in aquatic ecosystems. This observation, known as the “paradox of the plankton,” has long puzzled biologists."[8]
"Plumbing the depths of molecular-level information of related species, genomics offers a novel glimpse into this paradox. The researchers compared the genomes of two Ostreococcus species, O. lucimarinus and O. tauri, and saw dramatic changes in genome structure and metabolic capabilities."[8]
“We found several striking features of genome organization. Overlapping genes conserved across the species may enable them to cross-regulate their expression, while species-specific chromosomes with horizontally transferred genes can account for changes in the cell surface to adapt to different ecological niches.”[8]
“This work builds on the community’s emerging understanding about how carbon fixation is carried out by picoplankton.”[9]
“From an applied perspective, we are learning some of the tricks nature has employed to ‘engineer’ an extremely small eukaryote to thrive in nature–which may well find applications in bioengineering. It was particularly interesting to see the predicted use of selenium-containing enzymes as one of the tricks to maintain such tiny cells. There are many mechanisms that can account for species formation in photosynthetic phytoplankton, and this is just one of the major pieces to this long-standing puzzle for biologists.”[9]
“Assimilation of atmospheric CO2 by marine phytoplankton is a global-scale process that is responsible for about half of the biosphere net primary production. This active absorption of hundreds of millions of tons of carbon per day is essential for maintaining the control of the planet’s climate by counteracting greenhouse effects due to human activities. Clearly, this storage capacity is affected by changes in the photosynthetic efficiency of the algae, which in turn is linked to the environmental conditions experienced by these organisms in their environment.”[10]
Nicotiana
The osmotin-like protein (OLP) "has no intron and ... its promoter region contains two AGCCGCC sequences that are conserved in most basic PR-protein genes."[11]
The "AGCCGCC sequence(s) is a DNA element(s) responsive to ethylene. An EREBP2 protein, isolated as one of the proteins binding the AGCCGCC sequence of the tobacco rβ-1,3-glucanase gene, also was found to bind to the AGCCGCC sequence(s) of OLP gene. These results suggest that the ethylene-induced expression of OLP is regulated by trans-acting factor(s) common to basic PR-proteins."[11]
"AGCCGCC sequences were found at -46 to -52 and -161 to -167. There was no repeated sequence (-938 to -903)".[11]
"Expression of the osmotin gene is similar to that of the OLP gene. The osmotin gene also has several AGCCGCC sequences; a complete AGCCGCC (from -50 to -44), a slightly modified CGCCGCC (from -144 to -138), and an AGCCGCC sequence in reverse orientation (from -162 to -156)."[11]
Arabidopsis
In Arabidopsis thaliana "an ethylene-inducible, GCC box DNA-binding protein interacts with an ocs element binding protein".[1]
"In yeast and mammalian systems, it is well established that transcriptional down-regulation by DNA-binding repressors involves core histone deacetylation, mediated by their interaction within a complex containing histone deacetylase (e.g. HDA1), as well as various proteins (e.g. SIN3, SAP18, SAP30, and RhAp46). [An] Arabidopsis thaliana gene related in sequence to SAP18, designated AtSAP18, functions in transcription regulation in plants subjected to salt stress."[12]
Evidence has been provided "that SAP18 and HDA1 function as transcriptional repressors. [Further] they associate with Ethylene-Responsive Element binding Factors (ERFs) to create a hormone-sensitive multimeric repressor complex under conditions of environmental stress."[12]
"At the molecular level, the actions of ethylene upon gene expression involve Ethylene Responsive element binding Factors (ERFs), which display GCC box-specific binding activities in Arabidopsis (Ohme-Takagi and Shinshi, 1995). ERFs contain a highly conserved DNA binding domain (the EFR domain) consisting of 58-59 amino acids (Ohme-Takagi and Shinshi, 1995), which binds with high affinity to the GCC box (Hao et al., 1998)."[12]
Peaches
"An AGC box (AGCCGCC) was found [from peach (Prunus persica L. Batsch cv. Loring)] between 886 and 892 bp upstream of the translation start site which has been shown in other ethylene-responsive PR genes to be a binding site for ethylene-responsive binding factor proteins (ERF proteins) (Ohme-Takagi and Shinshi, 1995; Sato et al., 1996; Jia and Martin, 1999; Fujimoto et al., 2000)."[3]
"The peach ACO1 does have an AGC box that has been found to bind ethylene responsive elements in response to pathogen infections (Ohme-Takagi et al., 2000; Rushton et al., 2002). Only the apple ACO1 also contains this sequence. In addition, both PpACO1 and the apple ACO1 have a MADS box transcription factor binding site (CarG) (Tilly et al., 1998), but none of the other ACO genes do."[3]
E2F4
Gene ID: 1874 - "The protein encoded by this gene is a member of the E2F family of transcription factors. The E2F family plays a crucial role in the control of cell cycle and action of tumor suppressor proteins and is also a target of the transforming proteins of small DNA tumor viruses. The E2F proteins contain several evolutionally conserved domains found in most members of the family. These domains include a DNA binding domain, a dimerization domain which determines interaction with the differentiation regulated transcription factor proteins (DP), a transactivation domain enriched in acidic amino acids, and a tumor suppressor protein association domain which is embedded within the transactivation domain. This protein binds to all three of the tumor suppressor proteins pRB, p107 and p130, but with higher affinity to the last two. It plays an important role in the suppression of proliferation-associated genes, and its gene mutation and increased expression may be associated with human cancer."[13]
"The AGC triplet repeat in the coding region of the E2F-4 gene, a member of the family, has been reported to be mutated in colorectal cancers with a microsatellite instability (MSI) phenotype. We found a wider range variation of the repeat number in DNAs from tumors, the corresponding normal mucosa, and healthy individuals. A total of 5 repeat variants, ranging from 8 to 17 AGC repeats, was detected in 6 (9.7%) of the 62 healthy individuals and 8 (8.9%) of the 90 normal DNAs of the patients. The wild-type 13 repeat was present in all of these individuals. The variation of the AGC repeat number may be a polymorphism. Further, loss of heterozygosity (LOH) at the E2F-4 locus in the tumor tissues of 2 (25%) of the 8 informative cases was detected."[14]
Hypotheses
- An AGC box occurs in the human genome.
AGC box samplings
The Basic programs (starting with SuccessablesAGC.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), with the number of nts extended from 958 to 4445. The programs, the sequences they look for, and the numbers found are:
- negative strand in the negative direction is SuccessablesAGC--.bas, looking for AGCCGCC: 0.
- negative strand in the positive direction is SuccessablesAGC-+.bas, looking for AGCCGCC: 0.
- positive strand in the negative direction is SuccessablesAGC+-.bas, looking for AGCCGCC: 0.
- positive strand in the positive direction is SuccessablesAGC++.bas, looking for AGCCGCC: 0.
- inverse complement, negative strand, negative direction is SuccessablesAGCci--.bas, looking for GGCGGCT: 0.
- inverse complement, negative strand, positive direction is SuccessablesAGCci-+.bas, looking for GGCGGCT: 0.
- inverse complement, positive strand, negative direction is SuccessablesAGCci+-.bas, looking for GGCGGCT: 1, GGCGGCT at 1754.
- inverse complement, positive strand, positive direction is SuccessablesAGCci++.bas, looking for GGCGGCT: 0.
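The search these programs perform can be sketched in Python. This is only an illustrative sketch, not the SuccessablesAGC code itself: the function names and the toy sequence are hypothetical. Reporting the position of the last nucleotide of each match is an assumption, inferred from the listings on this page, where a GGCGGC hit is reported one position before the corresponding GGCGGCT hit.

```python
def reverse_complement(seq: str) -> str:
    """Return the inverse (reverse) complement of a DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def find_motif_ends(sequence: str, motif: str) -> list:
    """Return the 1-based position of the last nucleotide of every match.

    Overlapping matches are kept, and positions are listed from highest
    to lowest. The end-position convention is an assumption.
    """
    sequence, motif = sequence.upper(), motif.upper()
    ends = []
    start = sequence.find(motif)
    while start != -1:
        ends.append(start + len(motif))  # 0-based start + length = 1-based end
        start = sequence.find(motif, start + 1)  # advance by 1 to keep overlaps
    return sorted(ends, reverse=True)


# Hypothetical toy sequence, not taken from the A1BG promoter data.
toy = "TTAGCCGCCAAGGCGGCTTT"
print(reverse_complement("AGCCGCC"))
print(find_motif_ends(toy, "AGCCGCC"))
print(find_motif_ends(toy, reverse_complement("AGCCGCC")))
```

Running the same search on each strand and in each direction would give the eight result lines above; how the strands and directions are read in is not modeled here.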
AGC box negative direction (2596-1) distal promoters
- Positive strand, negative direction: GGCGGCT at 1754.
AGC random dataset samplings
- AGCr0: 1, AGCCGCC at 2380.
- AGCr1: 0.
- AGCr2: 0.
- AGCr3: 2, AGCCGCC at 4138, AGCCGCC at 1452.
- AGCr4: 1, AGCCGCC at 80.
- AGCr5: 1, AGCCGCC at 4353.
- AGCr6: 0.
- AGCr7: 0.
- AGCr8: 0.
- AGCr9: 1, AGCCGCC at 2449.
- AGCr0ci: 1, GGCGGCT at 3548.
- AGCr1ci: 0.
- AGCr2ci: 1, GGCGGCT at 4349.
- AGCr3ci: 1, GGCGGCT at 1443.
- AGCr4ci: 1, GGCGGCT at 4110.
- AGCr5ci: 0.
- AGCr6ci: 0.
- AGCr7ci: 0.
- AGCr8ci: 0.
- AGCr9ci: 0.
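For orientation, the number of hits expected by chance can be estimated with a short calculation. The sketch below assumes each random dataset is a sequence of independent, equally likely nucleotides and takes 4560 nts (the largest coordinate used in the region headings) as the dataset length. Both are assumptions, since the page does not state how the random datasets were generated.

```python
def expected_occurrences(sequence_length: int, motif_length: int) -> float:
    """Expected matches of one fixed motif in a uniform random sequence."""
    windows = sequence_length - motif_length + 1  # possible start positions
    return windows / 4 ** motif_length  # each window matches with chance 1/4^k


# Assumed dataset length; 7 is the length of AGCCGCC (and of GGCGGCT).
assumed_length = 4560
print(round(expected_occurrences(assumed_length, 7), 3))
```

Under these assumptions about 0.28 hits per dataset are expected for each 7-nt sequence. The listing above has 6 AGCCGCC hits and 4 GGCGGCT hits across the ten datasets.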
AGCr arbitrary (evens) (4560-2846) UTRs
- AGCr0ci: GGCGGCT at 3548.
- AGCr2ci: GGCGGCT at 4349.
- AGCr4ci: GGCGGCT at 4110.
AGCr alternate (odds) (4560-2846) UTRs
- AGCr3: AGCCGCC at 4138.
- AGCr5: AGCCGCC at 4353.
AGCr arbitrary positive direction (odds) (4445-4265) core promoters
- AGCr5: AGCCGCC at 4353.
AGCr alternate positive direction (evens) (4445-4265) core promoters
- AGCr2ci: GGCGGCT at 4349.
AGCr arbitrary positive direction (odds) (4265-4050) proximal promoters
- AGCr3: AGCCGCC at 4138.
AGCr alternate positive direction (evens) (4265-4050) proximal promoters
- AGCr4ci: GGCGGCT at 4110.
AGCr arbitrary negative direction (evens) (2596-1) distal promoters
- AGCr0: AGCCGCC at 2380.
- AGCr4: AGCCGCC at 80.
AGCr alternate negative direction (odds) (2596-1) distal promoters
- AGCr3: AGCCGCC at 1452.
- AGCr9: AGCCGCC at 2449.
- AGCr3ci: GGCGGCT at 1443.
AGCr arbitrary positive direction (odds) (4050-1) distal promoters
- AGCr3: AGCCGCC at 1452.
- AGCr9: AGCCGCC at 2449.
- AGCr3ci: GGCGGCT at 1443.
AGCr alternate positive direction (evens) (4050-1) distal promoters
- AGCr0: AGCCGCC at 2380.
- AGCr4: AGCCGCC at 80.
- AGCr0ci: GGCGGCT at 3548.
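The sorting of hits into regions can be sketched as a lookup on the coordinate ranges given in the headings above. The names `REGIONS` and `assign_region` are hypothetical. Treating the UTR range as negative direction follows the results table, and letting the first matching range win at a shared boundary (such as 2596 or 4265) is an assumption. The negative-direction core range is not stated in this section and is left out.

```python
from typing import Optional

# Inclusive coordinate ranges copied from the headings in this section.
REGIONS = {
    "negative": [("UTR", 2846, 4560), ("proximal", 2596, 2811), ("distal", 1, 2596)],
    "positive": [("core", 4265, 4445), ("proximal", 4050, 4265), ("distal", 1, 4050)],
}


def assign_region(position: int, direction: str) -> Optional[str]:
    """Return the first region whose range contains the position, if any."""
    for name, low, high in REGIONS[direction]:
        if low <= position <= high:
            return name
    return None  # position lies outside every listed range


# Positions taken from the AGCr listings above.
print(assign_region(4353, "positive"))  # AGCr5
print(assign_region(4138, "positive"))  # AGCr3
print(assign_region(2380, "negative"))  # AGCr0
print(assign_region(3548, "negative"))  # AGCr0ci
```

The same position can fall in different regions depending on direction, which is why a hit such as AGCr5 at 4353 appears under both the UTRs and the positive-direction core promoters.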
AGC box analysis and results
"An AGC box (AGCCGCC) was found [from peach (Prunus persica L. Batsch cv. Loring)] between 886 and 892 bp upstream of the translation start site which has been shown in other ethylene-responsive PR genes to be a binding site for ethylene-responsive binding factor proteins (ERF proteins) (Ohme-Takagi and Shinshi, 1995; Sato et al., 1996; Jia and Martin, 1999; Fujimoto et al., 2000)."[3]
Reals or randoms | Promoters | direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
---|---|---|---|---|---|---|
Reals | UTR | negative | 0 | 2 | 0 | 0 |
Randoms | UTR | arbitrary negative | 3 | 10 | 0.3 | 0.25 |
Randoms | UTR | alternate negative | 2 | 10 | 0.2 | 0.25 |
Reals | Core | negative | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
Reals | Core | positive | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.1 |
Randoms | Core | alternate positive | 1 | 10 | 0.1 | 0.1 |
Reals | Proximal | negative | 0 | 2 | 0 | 0 |
Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0 |
Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0 |
Reals | Proximal | positive | 0 | 2 | 0 | 0 |
Randoms | Proximal | arbitrary positive | 1 | 10 | 0.1 | 0.1 |
Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.1 |
Reals | Distal | negative | 1 | 2 | 0.5 | 0.5 |
Randoms | Distal | arbitrary negative | 2 | 10 | 0.2 | 0.25 |
Randoms | Distal | alternate negative | 3 | 10 | 0.3 | 0.25 |
Reals | Distal | positive | 0 | 2 | 0 | 0 |
Randoms | Distal | arbitrary positive | 3 | 10 | 0.3 | 0.3 |
Randoms | Distal | alternate positive | 3 | 10 | 0.3 | 0.3 |
Comparison:
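The only real AGC box occurs in the negative-direction distal promoter, where its occurrence (0.5 per strand) is greater than that of the randoms (0.25 on average); in all other regions the reals have none. This suggests that the real AGC box is likely active or activable.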
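The table columns can be reproduced with a few lines of arithmetic. The sketch assumes that Occurrences is Numbers divided by Strands and that Averages for the randoms is the mean of the arbitrary and alternate rows, which is consistent with the values shown. The function names are hypothetical.

```python
def occurrences_per_strand(hits: int, strands: int) -> float:
    """Occurrences column: number of hits divided by strands (or datasets)."""
    return hits / strands


def average(values) -> float:
    """Averages column for the randoms: mean of the listed occurrences."""
    return sum(values) / len(values)


# Negative-direction distal promoter rows from the AGC box table.
real = occurrences_per_strand(1, 2)
arbitrary = occurrences_per_strand(2, 10)
alternate = occurrences_per_strand(3, 10)
random_average = average([arbitrary, alternate])

print(real, random_average, real > random_average)
```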
GCC box samplings
Searching for GCCGCC with "⌘F" yields one occurrence between ZSCAN22 and A1BG and two between ZNF497 and A1BG, as can also be found by the computer programs.
The Basic programs (starting with SuccessablesGCC.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), with the number of nts extended from 958 to 4445. The programs, the sequences they look for, and the numbers found are:
- negative strand, negative direction, looking for GCCGCC, 1, GCCGCC at 2727.
- positive strand, negative direction, looking for GCCGCC, 0.
- negative strand, positive direction, looking for GCCGCC, 2, GCCGCC at 1757, GCCGCC at 904.
- positive strand, positive direction, looking for GCCGCC, 1, GCCGCC at 356.
- inverse complement, negative strand, negative direction, looking for GGCGGC, 0.
- inverse complement, positive strand, negative direction, looking for GGCGGC, 1, GGCGGC at 1753.
- inverse complement, positive strand, positive direction, looking for GGCGGC, 0.
- inverse complement, negative strand, positive direction, looking for GGCGGC, 3, GGCGGC at 1902, GGCGGC at 1794, GGCGGC at 354.
GCC negative direction (2811-2596) proximal promoters
- Negative strand, negative direction: GCCGCC at 2727.
GCC negative direction (2596-1) distal promoters
- Positive strand, negative direction: GGCGGC at 1753.
GCC positive direction (4050-1) distal promoters
- Negative strand, positive direction: GCCGCC at 1757, GCCGCC at 904.
- Negative strand, positive direction: GGCGGC at 1902, GGCGGC at 1794, GGCGGC at 354.
- Positive strand, positive direction: GCCGCC at 356.
GCC random dataset samplings
- GCCr0: 3, GCCGCC at 3407, GCCGCC at 2380, GCCGCC at 1384.
- GCCr1: 0.
- GCCr2: 3, GCCGCC at 3586, GCCGCC at 2598, GCCGCC at 1966.
- GCCr3: 3, GCCGCC at 4138, GCCGCC at 2792, GCCGCC at 1452.
- GCCr4: 4, GCCGCC at 1092, GCCGCC at 1089, GCCGCC at 1022, GCCGCC at 80.
- GCCr5: 1, GCCGCC at 4353.
- GCCr6: 0.
- GCCr7: 1, GCCGCC at 1770.
- GCCr8: 2, GCCGCC at 2518, GCCGCC at 2473.
- GCCr9: 3, GCCGCC at 2666, GCCGCC at 2449, GCCGCC at 1415.
- GCCr0ci: 1, GGCGGC at 3547.
- GCCr1ci: 0.
- GCCr2ci: 1, GGCGGC at 4348.
- GCCr3ci: 1, GGCGGC at 1442.
- GCCr4ci: 1, GGCGGC at 4109.
- GCCr5ci: 2, GGCGGC at 2932, GGCGGC at 678.
- GCCr6ci: 1, GGCGGC at 4434.
- GCCr7ci: 0.
- GCCr8ci: 1, GGCGGC at 4280.
- GCCr9ci: 3, GGCGGC at 3896, GGCGGC at 3628, GGCGGC at 1727.
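Because AGCCGCC ends with GCCGCC and GGCGGCT begins with GGCGGC, every AGC hit should have a matching GCC hit in the same dataset. The sketch below checks this against the random dataset listings. It assumes that AGCr*n* and GCCr*n* refer to the same random sequence and that positions mark the last nucleotide of a match, so direct hits should share a position and inverse complement hits should differ by one. The helper name `unmatched` is hypothetical.

```python
# End positions copied from the AGCr and GCCr listings in this section.
agc_direct = {"r0": [2380], "r3": [4138, 1452], "r4": [80], "r5": [4353], "r9": [2449]}
gcc_direct = {
    "r0": [3407, 2380, 1384],
    "r3": [4138, 2792, 1452],
    "r4": [1092, 1089, 1022, 80],
    "r5": [4353],
    "r9": [2666, 2449, 1415],
}
agc_ci = {"r0": [3548], "r2": [4349], "r3": [1443], "r4": [4110]}
gcc_ci = {"r0": [3547], "r2": [4348], "r3": [1442], "r4": [4109]}


def unmatched(longer: dict, shorter: dict, offset: int) -> list:
    """List (dataset, position) hits of the longer motif that lack a hit of
    the shorter motif at position - offset in the same dataset."""
    return [
        (name, pos)
        for name, positions in longer.items()
        for pos in positions
        if pos - offset not in shorter.get(name, [])
    ]


print(unmatched(agc_direct, gcc_direct, 0))  # AGCCGCC vs GCCGCC, same end
print(unmatched(agc_ci, gcc_ci, 1))          # GGCGGCT vs GGCGGC, one nt earlier
```

An empty list for both comparisons means the two sets of listings agree under the stated assumptions.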
GCCr arbitrary (evens) (4560-2846) UTRs
- GCCr0: GCCGCC at 3407.
- GCCr2: GCCGCC at 3586.
- GCCr0ci: GGCGGC at 3547.
- GCCr2ci: GGCGGC at 4348.
- GCCr4ci: GGCGGC at 4109.
- GCCr6ci: GGCGGC at 4434.
- GCCr8ci: GGCGGC at 4280.
GCCr alternate (odds) (4560-2846) UTRs
- GCCr3: GCCGCC at 4138.
- GCCr5: GCCGCC at 4353.
- GCCr5ci: GGCGGC at 2932.
- GCCr9ci: GGCGGC at 3896, GGCGGC at 3628.
GCCr arbitrary positive direction (odds) (4445-4265) core promoters
- GCCr5: GCCGCC at 4353.
GCCr alternate positive direction (evens) (4445-4265) core promoters
- GCCr2ci: GGCGGC at 4348.
- GCCr6ci: GGCGGC at 4434.
- GCCr8ci: GGCGGC at 4280.
GCCr arbitrary negative direction (evens) (2811-2596) proximal promoters
- GCCr2: GCCGCC at 2598.
GCCr alternate negative direction (odds) (2811-2596) proximal promoters
- GCCr3: GCCGCC at 2792.
- GCCr9: GCCGCC at 2666.
GCCr arbitrary positive direction (odds) (4265-4050) proximal promoters
- GCCr3: GCCGCC at 4138.
GCCr alternate positive direction (evens) (4265-4050) proximal promoters
- GCCr4ci: GGCGGC at 4109.
GCCr arbitrary negative direction (evens) (2596-1) distal promoters
- GCCr0: GCCGCC at 2380, GCCGCC at 1384.
- GCCr2: GCCGCC at 1966.
- GCCr4: GCCGCC at 1092, GCCGCC at 1089, GCCGCC at 1022, GCCGCC at 80.
- GCCr8: GCCGCC at 2518, GCCGCC at 2473.
GCCr alternate negative direction (odds) (2596-1) distal promoters
- GCCr3: GCCGCC at 1452.
- GCCr7: GCCGCC at 1770.
- GCCr9: GCCGCC at 2449, GCCGCC at 1415.
- GCCr3ci: GGCGGC at 1442.
- GCCr5ci: GGCGGC at 678.
- GCCr9ci: GGCGGC at 1727.
GCCr arbitrary positive direction (odds) (4050-1) distal promoters
- GCCr3: GCCGCC at 2792, GCCGCC at 1452.
- GCCr7: GCCGCC at 1770.
- GCCr9: GCCGCC at 2666, GCCGCC at 2449, GCCGCC at 1415.
- GCCr3ci: GGCGGC at 1442.
- GCCr5ci: GGCGGC at 2932, GGCGGC at 678.
- GCCr9ci: GGCGGC at 3896, GGCGGC at 3628, GGCGGC at 1727.
GCCr alternate positive direction (evens) (4050-1) distal promoters
- GCCr0: GCCGCC at 3407, GCCGCC at 2380, GCCGCC at 1384.
- GCCr2: GCCGCC at 3586, GCCGCC at 2598, GCCGCC at 1966.
- GCCr4: GCCGCC at 1092, GCCGCC at 1089, GCCGCC at 1022, GCCGCC at 80.
- GCCr8: GCCGCC at 2518, GCCGCC at 2473.
- GCCr0ci: GGCGGC at 3547.
GCC box analysis and results
"Expression of the osmotin gene is similar to that of the OLP gene. The osmotin gene also has several AGCCGCC sequences; a complete AGCCGCC (from -50 to -44), a slightly modified CGCCGCC (from -144 to -138), and an AGCCGCC sequence in reverse orientation (from -162 to -156)."[11]
Reals or randoms | Promoters | direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
---|---|---|---|---|---|---|
Reals | UTR | negative | 0 | 2 | 0 | 0 |
Randoms | UTR | arbitrary negative | 7 | 10 | 0.7 | 0.6 |
Randoms | UTR | alternate negative | 5 | 10 | 0.5 | 0.6 |
Reals | Core | negative | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
Reals | Core | positive | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.2 |
Randoms | Core | alternate positive | 3 | 10 | 0.3 | 0.2 |
Reals | Proximal | negative | 1 | 2 | 0.5 | 0.5 ± 0.5 (--1,+-0) |
Randoms | Proximal | arbitrary negative | 1 | 10 | 0.1 | 0.15 |
Randoms | Proximal | alternate negative | 2 | 10 | 0.2 | 0.15 |
Reals | Proximal | positive | 0 | 2 | 0 | 0 |
Randoms | Proximal | arbitrary positive | 1 | 10 | 0.1 | 0.1 |
Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.1 |
Reals | Distal | negative | 1 | 2 | 0.5 | 0.5 ± 0.5 (--0,+-1) |
Randoms | Distal | arbitrary negative | 9 | 10 | 0.9 | 0.8 |
Randoms | Distal | alternate negative | 7 | 10 | 0.7 | 0.8 |
Reals | Distal | positive | 6 | 2 | 3 | 3 ± 2 (-+5,++1) |
Randoms | Distal | arbitrary positive | 12 | 10 | 1.2 | 1.25 |
Randoms | Distal | alternate positive | 13 | 10 | 1.3 | 1.25 |
Comparison:
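The occurrences of real GCC boxes in the negative-direction proximals (0.5) are greater than the randoms (0.15), and the positive-direction distals (3) are above the randoms (1.25), although the negative-direction distals (0.5) are below the randoms (0.8). This suggests that the real GCC boxes are likely active or activable.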
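The "± " entries for the reals, such as 3 ± 2 (-+5,++1), can be reproduced from the per-strand counts. Reading the value after ± as half the range between the two strands is an assumption; it matches both 3 ± 2 and 0.5 ± 0.5 in the table. The function name is hypothetical.

```python
def mean_and_half_range(counts):
    """Return (mean, half of max minus min) for per-strand hit counts."""
    mean = sum(counts) / len(counts)
    half_range = (max(counts) - min(counts)) / 2
    return mean, half_range


# Positive-direction distal promoters: negative strand 5 hits, positive strand 1.
print(mean_and_half_range([5, 1]))
# Negative-direction proximal promoters: negative strand 1 hit, positive strand 0.
print(mean_and_half_range([1, 0]))
```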
GCC boxes occur in the following:
- AGC boxes: "The GCC box, also referred to as the AGC box (10), GCC element (11), or AGCCGCC sequence (13), is an ethylene-responsive element found in the promoters of a large number of [pathogenesis related] PR genes whose expression is up-regulated following pathogen attack."[1]
- DNA damage response elements (DRE) (Sumrada, core): "A consensus sequence, 5'-TAGCCGCCGRRRR-3' (where R = an unspecified purine nucleoside [A/G]), was generated from these data."[15]
- GGC triplets: "The transcription factors Uga3, Dal81 and Leu3 belong to the class III family (Zn(II)2Cys6 proteins), and they recognize highly related sequences rich in GGC triplets [15]."[16]
- Kozak sequences: GCCGCC(A/G)CCATGG.[17]
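The consensus sequences listed above use R, or (A/G), for a purine. A minimal sketch of how such a consensus could be turned into a regular expression and checked for an embedded GCC box follows. The names `IUPAC` and `consensus_to_regex` are hypothetical, the Kozak sequence is rewritten with R in place of (A/G), and the test sequences are made up for illustration.

```python
import re

# Only the nucleotide codes needed for the consensus sequences above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]"}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Compile a consensus sequence with IUPAC codes into a regex."""
    return re.compile("".join(IUPAC[code] for code in consensus.upper()))


dre = "TAGCCGCCGRRRR"    # DRE consensus quoted above
kozak = "GCCGCCRCCATGG"  # Kozak sequence with (A/G) written as R

# Both consensus sequences contain the GCC box core literally.
print("AGCCGCC" in dre, "GCCGCC" in kozak)

# Hypothetical test sequences: four purines match, a pyrimidine does not.
pattern = consensus_to_regex(dre)
print(bool(pattern.fullmatch("TAGCCGCCGAGAG")))
print(bool(pattern.fullmatch("TAGCCGCCGACAG")))
```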
Ethylene signaling pathway
"The GCC box, also referred to as the AGC box (10), GCC element (11), or AGCCGCC sequence (13), is an ethylene-responsive element found in the promoters of a large number of [pathogenesis related] PR genes whose expression is up-regulated following pathogen attack."[1]
In Arabidopsis thaliana "an ethylene-inducible, GCC box DNA-binding protein interacts with an ocs element binding protein".[1]
"Enhancer activity, ethylene responsiveness, and binding of nuclear proteins depend on the integrity of two copies of the AGC box, AGCCGCC, present in the promoters of several ethylene-responsive genes."[2]
"cDNA clones have been identified representing 4 novel DNA-binding proteins, called ethylene-responsive element binding proteins (EREBPs), that specifically bind the ERE AGC box".[2]
The osmotin-like protein (OLP) "has no intron and ... its promoter region contains two AGCCGCC sequences that are conserved in most basic PR-protein genes."[11]
The "AGCCGCC sequence(s) is a DNA element(s) responsive to ethylene. An EREBP2 protein, isolated as one of the proteins binding the AGCCGCC sequence of the tobacco rβ-1,3-glucanase gene, also was found to bind to the AGCCGCC sequence(s) of OLP gene. These results suggest that the ethylene-induced expression of OLP is regulated by trans-acting factor(s) common to basic PR-proteins."[11]
Evidence has been provided "that SAP18 and HDA1 function as transcriptional repressors. [Further] they associate with Ethylene-Responsive Element binding Factors (ERFs) to create a hormone-sensitive multimeric repressor complex under conditions of environmental stress."[12]
"At the molecular level, the actions of ethylene upon gene expression involve Ethylene Responsive element binding Factors (ERFs), which display GCC box-specific binding activities in Arabidopsis (Ohme-Takagi and Shinshi, 1995). ERFs contain a highly conserved DNA binding domain (the EFR domain) consisting of 58-59 amino acids (Ohme-Takagi and Shinshi, 1995), which binds with high affinity to the GCC box (Hao et al., 1998)."[12]
"An AGC box (AGCCGCC) was found [from peach (Prunus persica L. Batsch cv. Loring)] between 886 and 892 bp upstream of the translation start site which has been shown in other ethylene-responsive PR genes to be a binding site for ethylene-responsive binding factor proteins (ERF proteins) (Ohme-Takagi and Shinshi, 1995; Sato et al., 1996; Jia and Martin, 1999; Fujimoto et al., 2000)."[3]
"The peach ACO1 does have an AGC box that has been found to bind ethylene responsive elements in response to pathogen infections (Ohme-Takagi et al., 2000; Rushton et al., 2002). Only the apple ACO1 also contains this sequence. In addition, both PpACO1 and the apple ACO1 have a MADS box transcription factor binding site (CarG) (Tilly et al., 1998), but none of the other ACO genes do."[3]
Acknowledgements
The content on this page was first contributed by: Henry A. Hoff.
Initial content for this page in some instances came from Wikiversity.
See also
References
- ↑ 1.0 1.1 1.2 1.3 1.4 Michael Büttner and Karam B. Singh (May 27, 1997). "Arabidopsis thaliana ethylene-responsive element binding protein (AtEBP), an ethylene-inducible, GCC box DNA-binding protein interacts with an ocs element binding protein". Proceedings of the National Academy of Sciences of the United States of America. 94 (11): 5961–6. Retrieved 2014-05-02.
- ↑ 2.0 2.1 2.2 2.3 2.4 2.5 Gerhard Leubner-Metzger, Luciana Petruzzelli, Rosa Waldvogel, Regina Vögeli-Lange, and Frederick Meins, Jr. (November 1998). "Ethylene-responsive element binding protein (EREBP) expression and the transcriptional regulation of class I β-1, 3-glucanase during tobacco seed germination". Plant Molecular Biology. 38 (5): 785–95. doi:10.1023/A:1006040425383. Retrieved 2014-05-02.
- ↑ 3.0 3.1 3.2 3.3 3.4 3.5 Hangsik Moon and Ann M. Callahan (2004). "Developmental regulation of peach ACC oxidase promoter–GUS fusions in transgenic tomato fruits". Journal of Experimental Botany. 55 (402): 1519–28. doi:10.1093/jxb/erh162. Retrieved 2014-05-07.
- ↑ CM Hart, F. Nagy, and F. Meins Jr. (January 1993). "A 61 bp enhancer element of the tobacco beta-1,3-glucanase B gene interacts with one or more regulated nuclear proteins". Plant Molecular Biology. 21 (1): 121–31. PMID 8425042. Retrieved 2014-05-02.
- ↑ Imre E. Somssich (1994). L. Nover, ed. Regulatory Elements Governing Pathogenesis-Related (PR) Gene Expression, In: Plant Promoters and Transcription Factors. 20. Berlin: Springer-Verlag. pp. 163–79. doi:10.1007/978-3-540-48037-2_7. Retrieved 2014-05-07.
- ↑ Gwenael Piganeau, Klaas Vandepoele, Sébastien Gourbière, Yves Van de Peer, and Hervé Moreau (September 2009). "Unraveling cis-Regulatory Elements in the Genome of the Smallest Photosynthetic Eukaryote: Phylogenetic Footprinting in Ostreococcus". Journal of Molecular Evolution. 69 (3): 249–59. doi:10.1007/s00239-009-9271-0. Retrieved 2014-05-02.
- ↑ Sakihito Kitajima and Fumihiko Sato (1999). "Plant pathogenesis-related proteins: molecular mechanisms of gene expression and protein function". Journal of Biochemistry. 125 (1): 1–8. Retrieved 2016-01-07.
- ↑ 8.0 8.1 8.2 Igor Grigoriev (April 30, 2007). Puzzling Plankton Yield Secrets to Role in Evolution/Global Photosynthesis. Washington, DC USA: Department of Energy. Retrieved 2014-05-06.
- ↑ 9.0 9.1 Brian Palenik (April 30, 2007). Puzzling Plankton Yield Secrets to Role in Evolution/Global Photosynthesis. Washington, DC USA: Department of Energy. Retrieved 2014-05-06.
- ↑ Hervé Moreau (April 30, 2007). Puzzling Plankton Yield Secrets to Role in Evolution/Global Photosynthesis. Washington, DC USA: Department of Energy. Retrieved 2014-05-06.
- ↑ 11.0 11.1 11.2 11.3 11.4 11.5 11.6 Fumihiko Sato, Sakihito Kitajima and Tomotsugu Koyama (1996). "Ethylene-Induced Gene Expression of Osmotin-Like Protein, a Neutral Isoform of Tobacco PR-5, is Mediated by the AGCCGCC cis-Sequence". Plant and Cell Physiology. 37 (3): 249–55. Retrieved 2014-05-07.
- ↑ 12.0 12.1 12.2 12.3 12.4 Chun-Peng Song and David W. Galbraith (January 2006). "AtSAP18, an orthologue of human SAP18, is involved in the regulation of salt stress and mediates transcriptional repression in Arabidopsis". Plant Molecular Biology. 60 (2): 241–57. doi:10.1007/s11103-005-3880-9. Retrieved 2016-01-07.
- ↑ RefSeqJuly2008 (25 December 2016). E2F4 E2F transcription factor 4 [ Homo sapiens (human) ]. U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2017-01-08.
- ↑ X. Zhong, H. Hemmi, J. Koike, K. Tsujita, H. Shimatake (March 2000). "Various AGC repeat numbers in the coding region of the human transcription factor gene E2F-4". Human Mutation. 15 (3): 296–7. doi:10.1002/(SICI)1098-1004(200003)15:3<296::AID-HUMU18>3.0.CO;2-X. PMID 10679953. Retrieved 2017-01-08.
- ↑ Roberta A. Sumrada and Terrance G. Cooper (June 1987). "Ubiquitous upstream repression sequences control activation of the inducible arginase gene in yeast" (PDF). Proceedings of the National Academy of Sciences USA. 84: 3997–4001. doi:10.1073/pnas.84.12.3997. PMID 3295874. Retrieved 6 September 2020.
- ↑ Marcos Palavecino-Ruiz, Mariana Bermudez-Moretti and Susana Correa-Garcia (12 October 2017). "Unravelling the transcriptional regulation of Saccharomyces cerevisiae UGA genes: the dual role of transcription factor Leu3" (PDF). Microbiology. 163: 1692–1701. doi:10.1099/mic.0.000560. Retrieved 20 April 2021.
- ↑ Marilyn Kozak (October 1987). "An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs". Nucleic Acids Research. 15 (20): 8125–8148. doi:10.1093/nar/15.20.8125. PMID 3313277.
Further reading
- Gwenael Piganeau, Klaas Vandepoele, Sébastien Gourbière, Yves Van de Peer, and Hervé Moreau (September 2009). "Unraveling cis-Regulatory Elements in the Genome of the Smallest Photosynthetic Eukaryote: Phylogenetic Footprinting in Ostreococcus". Journal of Molecular Evolution. 69 (3): 249–59. doi:10.1007/s00239-009-9271-0. Retrieved 2014-05-02.
External links
- GenomeNet KEGG database
- Home - Gene - NCBI
- NCBI All Databases Search
- NCBI Site Search
- Office of Scientific & Technical Information
- PubChem Public Chemical Database
- Scirus for scientific information only advanced search