Downstream promoter element gene transcriptions: Difference between revisions
Line 411:
! Reals or randoms !! Promoters !! direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)
|-
| Reals || UTR || negative || 4 || 2 || 2 || 2 ± 1 (--3,+-1)
|-
| Randoms || UTR || arbitrary negative || 0 || 10 || 0 || 0
Line 441:
| Randoms || Proximal || alternate positive || 0 || 10 || 0 || 0
|-
| Reals || Distal || negative || 9 || 2 || 4.5 || 4.5 ± 0.5 (--5,+-4)
|-
| Randoms || Distal || arbitrary negative || 0 || 10 || 0 || 0
Line 447:
| Randoms || Distal || alternate negative || 0 || 10 || 0 || 0
|-
| Reals || Distal || positive || 9 || 2 || 4.5 || 4.5 ± 3.5 (-+1,++8)
|-
| Randoms || Distal || arbitrary positive || 0 || 10 || 0 || 0
Revision as of 19:51, 16 July 2022
Editor-In-Chief: Henry A. Hoff
The figure on the right is an overview of four core promoter elements: the B recognition element (BRE), TATA box, initiator element (Inr), and downstream promoter element (DPE), showing their respective consensus sequences and their distance from the transcription start site.[1]
The downstream promoter element (DPE) is a core promoter element present in a range of species, including humans but excluding Saccharomyces cerevisiae.[2]
Gene transcriptions
"Transcription by RNA polymerase II is directed by cis-acting [acting on the same DNA molecule] DNA sequences that typically consist of a core promoter along with regulatory elements, such as enhancers [distant-acting DNA sequences], that contain binding sites for sequence-specific transcriptional activator and/or repressor proteins."[3]
Core promoters
"[T]he core promoter [consists of] the DNA sequences, which encompass the transcription start site (within about -40 and +40 [nucleotides] relative to the +1 start site)"[3].
"[T]he core sequence of the DPE is located at precisely +28 to +32 relative to the A+1 nucleotide in the Inr"[4]. It is located about 28–33 nucleotides downstream of the transcription start site.[2]
DPE-dependent basal transcription depends highly on the Inr (and vice versa) and on correct spacing between the two elements.[5][3][6]
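The spacing requirement can be illustrated with a short Python sketch. It is not taken from the cited papers: the function name, the 0-based index convention, and the toy sequences are assumptions made for illustration, and the six-nucleotide general consensus RGWYVT is assumed to occupy +28 to +33. The check is all-or-nothing, whereas the cited work reports a graded loss of transcription when the spacing changes.

```python
import re

# General DPE consensus RGWYVT written out as a regular expression.
DPE_REGEX = re.compile(r"[AG]G[AT][CT][ACG]T")

def dpe_at_expected_spacing(seq: str, a_plus_1_index: int) -> bool:
    """Check for a DPE match beginning at +28 relative to the Inr A+1.

    `a_plus_1_index` is the 0-based index of the A+1 nucleotide in `seq`
    (an assumed convention). Promoter numbering has no position 0, so
    +28 lies 27 bases downstream of +1.
    """
    start = a_plus_1_index + 27
    window = seq[start:start + 6].upper()
    return DPE_REGEX.fullmatch(window) is not None

# Hypothetical toy sequence: A+1 at index 2, filler, AGACGT at +28..+33.
example = "TC" + "A" + "C" * 26 + "AGACGT" + "CC"
# Same sequence with the DPE moved 3 nucleotides further downstream.
shifted = "TC" + "A" + "C" * 29 + "AGACGT" + "CC"

print(dpe_at_expected_spacing(example, 2))
print(dpe_at_expected_spacing(shifted, 2))
```

The first call finds the motif at the expected offset; the second does not, because the motif has moved 3 nucleotides away from +28.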
Initiator elements
"There is a strict requirement for spacing between the [Initiator element] Inr and DPE motifs, as an increase or decrease of 3 nucleotides in the distance between the Inr and DPE causes a seven- to eightfold reduction in transcription as well as a significant reduction in the binding of purified TFIID."[3]
Consensus sequences
The early DPE consensus sequence was RGWCGTG.[5][7]
The DPE consensus sequence is the more general sequence RGWYVT, or (A/G)G(A/T)(C/T)(A/C/G)T.[2]
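As a minimal sketch, the IUPAC letters in these consensus sequences can be expanded into a regular expression and tested against candidate six-nucleotide sites. The helper names below are hypothetical and the code is not the Basic programs referred to later on this page.

```python
import re

# Bases allowed by each IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def consensus_to_regex(consensus: str) -> str:
    """Expand an IUPAC consensus such as RGWYVT into a regular expression."""
    return "".join(
        IUPAC[c] if len(IUPAC[c]) == 1 else "[" + IUPAC[c] + "]"
        for c in consensus.upper()
    )

def matches_consensus(site: str, consensus: str) -> bool:
    """Return True if `site` matches `consensus` over its full length."""
    return re.fullmatch(consensus_to_regex(consensus), site.upper()) is not None

DPE_GENERAL = "RGWYVT"   # general consensus
DPE_EARLY = "RGWCGTG"    # early consensus

print(consensus_to_regex(DPE_GENERAL))        # [AG]G[AT][CT][ACG]T
print(matches_consensus("AGTCCT", DPE_GENERAL))
print(matches_consensus("AGACGTG", DPE_EARLY))
```

Writing the consensus as data rather than as a hand-written pattern lets the same two helpers serve both the early and the general consensus.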
The DPE in "the ATP‐binding cassette subfamily G member 2 gene in the marine pufferfish Takifugu rubripes" is 5'-AGTCTC-3'.[8]
DPE-containing promoters
"The ... Drosophila Antennapedia P2 (Antp P2) [promoter contains] a 7-nucleotide sequence that conforms to the DPE consensus"[3]. GeneID: 40835 Antp Antennapedia [Drosophila melanogaster] is also known as Antp P2.[9] GeneID: 3204 HOXA7 homeobox A7 [Homo sapiens] is also known as ANTP, and "[t]his gene is highly similar to the antennapedia (Antp) gene of Drosophila."[10] Because of this similarity, GeneID: 3204 may have a DPE in its core promoter, as the Drosophila gene does.
"[T]he TATA-less Drosophila Abdominal-B (Abd-B) promoter [has a] partial DPE sequence"[3]. GeneID: 3205 HOXA9 homeobox A9 [Homo sapiens] is also known as ABD-B, and "[t]his gene is highly similar to the abdominal-B (Abd-B) gene of Drosophila."[11] GeneID: 3205 may also be TATA-less and have a DPE.
General transcription factor II Ds
The DPE "is required for the binding of purified [general transcription factor II D] TFIID to a subset of TATA-less promoters"[4].
"Photo-cross-linking analysis of purified TFIID with a TATA-less DPE-containing promoter revealed specific cross-linking of dTAFII60 [TAF6 GeneID: 6878] and dTAFII40 [TAF9 GeneID: 6880] to the DPE, with a higher efficiency of cross-linking to dTAFII60 than to dTAFII40. These data, combined with the previously well-characterized interactions between the two TAFs and their homology to histones H4 and H3, suggest that a dTAFII60–dTAFII40 heterotetramer binds to the DPE."[3]
Hypotheses
- The DPE is not used to transcribe A1BG.
DPE (Juven-Gershon) samplings
The Basic programs (starting with SuccessablesDPE.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). With the positive direction expanded from 958 to 4445, the programs looked for the DPE consensus 5'-(A/G)G(A/T)(C/T)(A/C/G)T-3' and for its complement, inverse, and inverse complement, and found:
- Negative strand, negative direction: 63, AGTCCT at 4437, AGATGT at 4213, AGTTCT at 4179, GGTCCT at 4171, AGTCCT at 4139, GGACAT at 4122, AGATGT at 4063, AGTTCT at 4028, GGTTCT at 4020, GGTTGT at 3980, GGACAT at 3971, GGACCT at 3907, AGACCT at 3836, GGACCT at 3745, GGTCGT at 3732, GGTTCT at 3274, GGTCCT at 3250, AGTCCT at 3218, GGTTGT at 3138, AGTCCT at 3111, GGTCGT at 3071, GGACAT at 3062, AGATGT at 2989, GGTTAT at 2849, GGACAT at 2673, GGTTGT at 2611, AGTCCT at 2588, GGTTGT at 2548, GGACAT at 2539, AGTTAT at 2497, GGACAT at 2338, GGACCT at 2269, AGTCCT at 2251, GGTCAT at 2212, GGTTGT at 2149, AGTCCT at 2135, GGACAT at 1912, GGTCGT at 1786, AGACAT at 1777, GGTCGT at 1612, AGATAT at 1526, AGTCCT at 1276, GGACAT at 1259, AGATGT at 1225, GGTTGT at 1204, GGTCGT at 1141, GGACAT at 1132, AGTCCT at 985, GGACAT at 968, GGTTCT at 875, GGTCCT at 851, GGACAT at 802, AGTCCT at 715, GGTCGT at 677, GGACAT at 668, AGTCCT at 579, GGTTCT at 557, GGTCGT at 541, AGATGT at 482, AGTCCT at 442, GGTTCT at 420, GGTCGT at 404, GGACAT at 395.
- Negative strand, positive direction: 18, GGTTCT at 4074, AGTCCT at 3864, GGATGT at 3575, AGACCT at 3551, AGTTAT at 3382, GGTTGT at 3051, GGTTAT at 3025, AGTCCT at 2999, AGTTCT at 2955, GGTTCT at 2923, AGACCT at 2862, GGATGT at 2715, GGATAT at 2660, AGTTCT at 1988, GGACAT at 1870, AGTCCT at 758, GGTCCT at 219, GGACCT at 38.
- Positive strand, negative direction: 16, AGACAT at 4508, AGTTCT at 4418, AGACGT at 4236, AGATGT at 3621, AGATAT at 3466, AGACAT at 3434, AGATAT at 2982, AGACAT at 2949, AGACAT at 2881, AGATAT at 1596, AGACAT at 1570, GGATGT at 785, AGATGT at 245, AGACAT at 171, GGATAT at 109, GGATAT at 75.
- Positive strand, positive direction: 31, AGATCT at 4065, AGTCGT at 4024, AGTCCT at 3869, GGTCGT at 3721, GGTTGT at 3634, AGTTAT at 3425, GGACCT at 3363, AGACGT at 3279, AGACGT at 3268, AGTCGT at 3156, AGACGT at 3061, AGTCGT at 3042, AGACGT at 2857, AGTCCT at 2621, AGTCGT at 2199, AGTCGT at 2103, GGACGT at 1470, GGTCGT at 1458, GGACGT at 1370, GGTCGT at 1358, GGACGT at 1119, GGTCCT at 708, GGTCGT at 618, GGACCT at 599, GGACGT at 436, GGTCCT at 425, AGACCT at 271, AGACGT at 224, GGACGT at 192, GGTTCT at 178, GGACCT at 41.
- complement, negative strand, negative direction is SuccessablesDPEJc--.bas, looking for 5'-(C/T)C(A/T)(A/G)(C/G/T)A-3', 16, 5'-CCTATA-3' at 75, 5'-CCTATA-3' at 109, 5'-TCTGTA-3' at 171, 5'-TCTACA-3' at 245, 5'-CCTACA-3' at 785, 5'-TCTGTA-3' at 1570, 5'-TCTATA-3' at 1596, 5'-TCTGTA-3' at 2881, 5'-TCTGTA-3' at 2949, 5'-TCTATA-3' at 2982, 5'-TCTGTA-3' at 3434, 5'-TCTATA-3' at 3466, 5'-TCTACA-3' at 3621, 5'-TCTGCA-3' at 4236, 5'-TCAAGA-3' at 4418, 5'-TCTGTA-3' at 4508.
- complement, negative strand, positive direction is SuccessablesDPEJc-+.bas, looking for 5'-(C/T)C(A/T)(A/G)(C/G/T)A-3', 31, 5'-CCTGGA-3' at 41, 5'-CCAAGA-3' at 178, 5'-CCTGCA-3' at 192, 5'-TCTGCA-3' at 224, 5'-TCTGGA-3' at 271, 5'-CCAGGA-3' at 425, 5'-CCTGCA-3' at 436, 5'-CCTGGA-3' at 599, 5'-CCAGCA-3' at 618, 5'-CCAGGA-3' at 708, 5'-CCTGCA-3' at 1119, 5'-CCAGCA-3' at 1358, 5'-CCTGCA-3' at 1370, 5'-CCAGCA-3' at 1458, 5'-CCTGCA-3' at 1470, 5'-TCAGCA-3' at 2103, 5'-TCAGCA-3' at 2199, 5'-TCAGGA-3' at 2621, 5'-TCTGCA-3' at 2857, 5'-TCAGCA-3' at 3042, 5'-TCTGCA-3' at 3061, 5'-TCAGCA-3' at 3156, 5'-TCTGCA-3' at 3268, 5'-TCTGCA-3' at 3279, 5'-CCTGGA-3' at 3363, 5'-TCAATA-3' at 3425, 5'-CCAACA-3' at 3634, 5'-CCAGCA-3' at 3721, 5'-TCAGGA-3' at 3869, 5'-TCAGCA-3' at 4024, 5'-TCTAGA-3' at 4065.
- complement, positive strand, negative direction is SuccessablesDPEJc+-.bas, looking for 5'-(C/T)C(A/T)(A/G)(C/G/T)A-3', 63, 5'-CCTGTA-3' at 395, 5'-CCAGCA-3' at 404, 5'-CCAAGA-3' at 420, 5'-TCAGGA-3' at 442, 5'-TCTACA-3' at 482, 5'-CCAGCA-3' at 541, 5'-CCAAGA-3' at 557, 5'-TCAGGA-3' at 579, 5'-CCTGTA-3' at 668, 5'-CCAGCA-3' at 677, 5'-TCAGGA-3' at 715, 5'-CCTGTA-3' at 802, 5'-CCAGGA-3' at 851, 5'-CCAAGA-3' at 875, 5'-CCTGTA-3' at 968, 5'-TCAGGA-3' at 985, 5'-CCTGTA-3' at 1132, 5'-CCAGCA-3' at 1141, 5'-CCAACA-3' at 1204, 5'-TCTACA-3' at 1225, 5'-CCTGTA-3' at 1259, 5'-TCAGGA-3' at 1276, 5'-TCTATA-3' at 1526, 5'-CCAGCA-3' at 1612, 5'-TCTGTA-3' at 1777, 5'-CCAGCA-3' at 1786, 5'-CCTGTA-3' at 1912, 5'-TCAGGA-3' at 2135, 5'-CCAACA-3' at 2149, 5'-CCAGTA-3' at 2212, 5'-TCAGGA-3' at 2251, 5'-CCTGGA-3' at 2269, 5'-CCTGTA-3' at 2338, 5'-TCAATA-3' at 2497, 5'-CCTGTA-3' at 2539, 5'-CCAACA-3' at 2548, 5'-TCAGGA-3' at 2588, 5'-CCAACA-3' at 2611, 5'-CCTGTA-3' at 2673, 5'-CCAATA-3' at 2849, 5'-TCTACA-3' at 2989, 5'-CCTGTA-3' at 3062, 5'-CCAGCA-3' at 3071, 5'-TCAGGA-3' at 3111, 5'-CCAACA-3' at 3138, 5'-TCAGGA-3' at 3218, 5'-CCAGGA-3' at 3250, 5'-CCAAGA-3' at 3274, 5'-CCAGCA-3' at 3732, 5'-CCTGGA-3' at 3745, 5'-TCTGGA-3' at 3836, 5'-CCTGGA-3' at 3907, 5'-CCTGTA-3' at 3971, 5'-CCAACA-3' at 3980, 5'-CCAAGA-3' at 4020, 5'-TCAAGA-3' at 4028, 5'-TCTACA-3' at 4063, 5'-CCTGTA-3' at 4122, 5'-TCAGGA-3' at 4139, 5'-CCAGGA-3' at 4171, 5'-TCAAGA-3' at 4179, 5'-TCTACA-3' at 4213, 5'-TCAGGA-3' at 4437.
- complement, positive strand, positive direction is SuccessablesDPEJc++.bas, looking for 5'-(C/T)C(A/T)(A/G)(C/G/T)A-3', 18, 5'-CCTGGA-3' at 38, 5'-CCAGGA-3' at 219, 5'-TCAGGA-3' at 758, 5'-CCTGTA-3' at 1870, 5'-TCAAGA-3' at 1988, 5'-CCTATA-3' at 2660, 5'-CCTACA-3' at 2715, 5'-TCTGGA-3' at 2862, 5'-CCAAGA-3' at 2923, 5'-TCAAGA-3' at 2955, 5'-TCAGGA-3' at 2999, 5'-CCAATA-3' at 3025, 5'-CCAACA-3' at 3051, 5'-TCAATA-3' at 3382, 5'-TCTGGA-3' at 3551, 5'-CCTACA-3' at 3575, 5'-TCAGGA-3' at 3864, 5'-CCAAGA-3' at 4074.
- inverse complement, negative strand, negative direction: 18, AGGACC at 4546, ACAACC at 3942, AGGACC at 3906, ACGACC at 3864, AGAACC at 3793, AGGTCC at 3585, ATGACT at 3542, ATAACC at 3529, ACGTCT at 3431, ATATCT at 2903, ACGACC at 2326, ACAACT at 1853, AGGACC at 1841, AGAACC at 1649, ATGTCT at 1567, ACATCT at 970, AGGACC at 596, ACATCT at 284.
- inverse complement, negative strand, positive direction: 16, ATGTCC at 4367, AGAACT at 4048, ATGACC at 3784, AGGTCT at 3771, ATGTCC at 3577, ACGTCT at 3256, ATGACT at 3029, AGGTCT at 3019, AGAACC at 2776, AGGTCT at 2258, AGAACC at 2225, AGAACT at 1951, AGAACC at 1811, AGATCC at 965, AGATCC at 865, AGGTCC at 218.
- inverse complement, positive strand, negative direction: 12, AGATCC at 4476, AGAACC at 4451, ATGTCT at 3833, AGGACT at 3640, ATGTCT at 2986, ACAACC at 2844, ATGACT at 2786, ATGACC at 2189, ACGTCT at 1774, ACATCC at 1572, ATATCC at 1529, AGATCC at 973.
- inverse complement, positive strand, positive direction: 39, AGGACC at 4409, ACGTCT at 4317, AGGACT at 4186, ACGACC at 4177, ATAACT at 4161, AGAACT at 4131, AGATCC at 4077, AGATCT at 4065, AGGTCC at 4032, AGGTCT at 3891, ACGTCT at 3831, AGGTCT at 3806, AGGTCC at 3687, ACGTCC at 3466, AGGACC at 3296, ATGACC at 3117, AGGTCC at 3111, ACGTCT at 2859, ACAACC at 2816, ACGTCC at 2745, ACGTCT at 2721, ACGTCC at 2683, ATATCC at 2550, AGGACC at 2501, ACATCC at 2255, AGGACT at 2211, ACAACC at 2185, ACGTCT at 1937, ACATCC at 1875, ACGTCC at 1788, ACGACC at 1779, ACGACC at 1736, ATGACT at 1286, ACGTCC at 658, ACGTCT at 438, ACGTCC at 194, AGGTCC at 33, AGGTCT at 15, AGGTCC at 8.
- inverse, negative strand, negative direction, is SuccessablesDPEJi--.bas, looking for 5'-T(A/C/G)(C/T)(A/T)G(A/G)-3', 12, 5'-TCTAGG-3' at 973, 5'-TATAGG-3' at 1529, 5'-TGTAGG-3' at 1572, 5'-TGCAGA-3' at 1774, 5'-TACTGG-3' at 2189, 5'-TACTGA-3' at 2786, 5'-TGTTGG-3' at 2844, 5'-TACAGA-3' at 2986, 5'-TCCTGA-3' at 3640, 5'-TACAGA-3' at 3833, 5'-TCTTGG-3' at 4451, 5'-TCTAGG-3' at 4476.
- inverse, negative strand, positive direction, is SuccessablesDPEJi-+.bas, looking for 5'-T(A/C/G)(C/T)(A/T)G(A/G)-3', 39, 5'-TCCAGG-3' at 8, 5'-TCCAGA-3' at 15, 5'-TCCAGG-3' at 33, 5'-TGCAGG-3' at 194, 5'-TGCAGA-3' at 438, 5'-TGCAGG-3' at 658, 5'-TACTGA-3' at 1286, 5'-TGCTGG-3' at 1736, 5'-TGCTGG-3' at 1779, 5'-TGCAGG-3' at 1788, 5'-TGTAGG-3' at 1875, 5'-TGCAGA-3' at 1937, 5'-TGTTGG-3' at 2185, 5'-TCCTGA-3' at 2211, 5'-TGTAGG-3' at 2255, 5'-TCCTGG-3' at 2501, 5'-TATAGG-3' at 2550, 5'-TGCAGG-3' at 2683, 5'-TGCAGA-3' at 2721, 5'-TGCAGG-3' at 2745, 5'-TGTTGG-3' at 2816, 5'-TGCAGA-3' at 2859, 5'-TCCAGG-3' at 3111, 5'-TACTGG-3' at 3117, 5'-TCCTGG-3' at 3296, 5'-TGCAGG-3' at 3466, 5'-TCCAGG-3' at 3687, 5'-TCCAGA-3' at 3806, 5'-TGCAGA-3' at 3831, 5'-TCCAGA-3' at 3891, 5'-TCCAGG-3' at 4032, 5'-TCTAGA-3' at 4065, 5'-TCTAGG-3' at 4077, 5'-TCTTGA-3' at 4131, 5'-TATTGA-3' at 4161, 5'-TGCTGG-3' at 4177, 5'-TCCTGA-3' at 4186, 5'-TGCAGA-3' at 4317, 5'-TCCTGG-3' at 4409.
- inverse, positive strand, negative direction, is SuccessablesDPEJi+-.bas, looking for 5'-T(A/C/G)(C/T)(A/T)G(A/G)-3', 18, 5'-TGTAGA-3' at 284, 5'-TCCTGG-3' at 596, 5'-TGTAGA-3' at 970, 5'-TACAGA-3' at 1567, 5'-TCTTGG-3' at 1649, 5'-TCCTGG-3' at 1841, 5'-TGTTGA-3' at 1853, 5'-TGCTGG-3' at 2326, 5'-TATAGA-3' at 2903, 5'-TGCAGA-3' at 3431, 5'-TATTGG-3' at 3529, 5'-TACTGA-3' at 3542, 5'-TCCAGG-3' at 3585, 5'-TCTTGG-3' at 3793, 5'-TGCTGG-3' at 3864, 5'-TCCTGG-3' at 3906, 5'-TGTTGG-3' at 3942, 5'-TCCTGG-3' at 4546.
- inverse, positive strand, positive direction, is SuccessablesDPEJi++.bas, looking for 5'-T(A/C/G)(C/T)(A/T)G(A/G)-3', 16, 5'-TCCAGG-3' at 218, 5'-TCTAGG-3' at 865, 5'-TCTAGG-3' at 965, 5'-TCTTGG-3' at 1811, 5'-TCTTGA-3' at 1951, 5'-TCTTGG-3' at 2225, 5'-TCCAGA-3' at 2258, 5'-TCTTGG-3' at 2776, 5'-TCCAGA-3' at 3019, 5'-TACTGA-3' at 3029, 5'-TGCAGA-3' at 3256, 5'-TACAGG-3' at 3577, 5'-TCCAGA-3' at 3771, 5'-TACTGG-3' at 3784, 5'-TCTTGA-3' at 4048, 5'-TACAGG-3' at 4367.
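The complement, inverse, and inverse complement patterns listed above can be derived mechanically from RGWYVT. The Python sketch below does this and scans a sequence for every match, including overlapping ones. It is an illustration only and not a port of the SuccessablesDPE Basic programs: the function names are hypothetical, reporting a hit by the 1-based position of its first nucleotide is an assumed convention, and the toy sequences are invented.

```python
import re

# Bases allowed by each IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement of each IUPAC code, e.g. R (A/G) pairs with Y (T/C).
COMPLEMENT = str.maketrans("ACGTRYWSKMBDHVN", "TGCAYRWSMKVHDBN")

def complement(pattern: str) -> str:
    """Complement each code without changing the reading order."""
    return pattern.translate(COMPLEMENT)

def inverse(pattern: str) -> str:
    """Reverse the reading order without complementing."""
    return pattern[::-1]

def inverse_complement(pattern: str) -> str:
    """Complement each code and reverse the reading order."""
    return complement(pattern)[::-1]

def find_sites(seq: str, pattern: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for every match of `pattern`."""
    body = "".join("[" + IUPAC[c] + "]" for c in pattern)
    # A lookahead lets overlapping matches be reported as well.
    regex = re.compile("(?=(" + body + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(seq.upper())]

DPE = "RGWYVT"
print(complement(DPE))          # YCWRBA = (C/T)C(A/T)(A/G)(C/G/T)A
print(inverse(DPE))             # TVYWGR = T(A/C/G)(C/T)(A/T)G(A/G)
print(inverse_complement(DPE))  # ABRWCY

# Invented toy sequences, not taken from the A1BG promoter.
print(find_sites("TTAGTCCTAA", DPE))
print(find_sites("CCAGGACCAA", inverse_complement(DPE)))
```

The derived patterns agree with the search patterns quoted in the list, and the inverse complement pattern ABRWCY is consistent with hits such as AGGACC and ACGTCT.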
DPE (Juven-Gershon) UTRs
Negative strand, negative direction: AGGACC at 4546, AGTCCT at 4437, AGATGT at 4213, AGTTCT at 4179, GGTCCT at 4171, AGTCCT at 4139, GGACAT at 4122, AGATGT at 4063, AGTTCT at 4028, GGTTCT at 4020, GGTTGT at 3980, GGACAT at 3971, ACAACC at 3942, GGACCT at 3907, AGGACC at 3906, ACGACC at 3864, AGACCT at 3836, AGAACC at 3793, GGACCT at 3745, GGTCGT at 3732, AGGTCC at 3585, ATGACT at 3542, ATAACC at 3529, ACGTCT at 3431, GGTTCT at 3274, GGTCCT at 3250, AGTCCT at 3218, GGTTGT at 3138, AGTCCT at 3111, GGTCGT at 3071, GGACAT at 3062, AGATGT at 2989, ATATCT at 2903, GGTTAT at 2849.
Positive strand, negative direction: AGACAT at 4508, AGATCC at 4476, AGAACC at 4451, AGTTCT at 4418, AGACGT at 4236, ATGTCT at 3833, AGGACT at 3640, AGATGT at 3621, AGATAT at 3466, AGACAT at 3434, ATGTCT at 2986, AGATAT at 2982, AGACAT at 2949, AGACAT at 2881.
DPE (Juven-Gershon) core promoters
Positive strand, negative direction: ACAACC at 2844.
Negative strand, positive direction: ATGTCC at 4367.
Positive strand, positive direction: AGGACC at 4409, ACGTCT at 4317.
DPE (Juven-Gershon) proximal promoters
Negative strand, negative direction: GGACAT at 2673, GGTTGT at 2611.
Positive strand, negative direction: ATGACT at 2786.
Negative strand, positive direction: GGTTCT at 4074.
Positive strand, positive direction: AGGACT at 4186, ACGACC at 4177, ATAACT at 4161, AGAACT at 4131, AGATCC at 4077, AGATCT at 4065.
DPE (Juven-Gershon) distal promoters
Negative strand, negative direction: AGTCCT at 2588, GGTTGT at 2548, GGACAT at 2539, AGTTAT at 2497, GGACAT at 2338, ACGACC at 2326, GGACCT at 2269, AGTCCT at 2251, GGTCAT at 2212, GGTTGT at 2149, AGTCCT at 2135, GGACAT at 1912, ACAACT at 1853, AGGACC at 1841, GGTCGT at 1786, AGACAT at 1777, AGAACC at 1649, GGTCGT at 1612, ATGTCT at 1567, AGATAT at 1526, AGTCCT at 1276, GGACAT at 1259, AGATGT at 1225, GGTTGT at 1204, GGTCGT at 1141, GGACAT at 1132, AGTCCT at 985, ACATCT at 970, GGACAT at 968, GGTTCT at 875, GGTCCT at 851, GGACAT at 802, AGTCCT at 715, GGTCGT at 677, GGACAT at 668, AGGACC at 596, AGTCCT at 579, GGTTCT at 557, GGTCGT at 541, AGATGT at 482, AGTCCT at 442, GGTTCT at 420, GGTCGT at 404, GGACAT at 395, ACATCT at 284.
Positive strand, negative direction: ATGACC at 2189, ACGTCT at 1774, AGATAT at 1596, ACATCC at 1572, AGACAT at 1570, ATATCC at 1529, AGATCC at 973, GGATGT at 785, AGATGT at 245, AGACAT at 171, GGATAT at 109, GGATAT at 75.
Positive direction
Negative strand, positive direction: AGAACT at 4048, AGTCCT at 3864, ATGACC at 3784, AGGTCT at 3771, ATGTCC at 3577, GGATGT at 3575, AGACCT at 3551, AGTTAT at 3382, ACGTCT at 3256, GGTTGT at 3051, ATGACT at 3029, GGTTAT at 3025, AGGTCT at 3019, AGTCCT at 2999, AGTTCT at 2955, GGTTCT at 2923, AGACCT at 2862, AGAACC at 2776, GGATGT at 2715, GGATAT at 2660, AGGTCT at 2258, AGAACC at 2225, AGTTCT at 1988, AGAACT at 1951, GGACAT at 1870, AGAACC at 1811, AGATCC at 965, AGATCC at 865, AGTCCT at 758, GGTCCT at 219, AGGTCC at 218, GGACCT at 38.
Positive strand, positive direction: AGGTCC at 4032, AGTCGT at 4024, AGGTCT at 3891, AGTCCT at 3869, ACGTCT at 3831, AGGTCT at 3806, GGTCGT at 3721, AGGTCC at 3687, GGTTGT at 3634, ACGTCC at 3466, AGTTAT at 3425, GGACCT at 3363, AGGACC at 3296, AGACGT at 3279, AGACGT at 3268, AGTCGT at 3156, ATGACC at 3117, AGGTCC at 3111, AGACGT at 3061, AGTCGT at 3042, ACGTCT at 2859, AGACGT at 2857, ACAACC at 2816, ACGTCC at 2745, ACGTCT at 2721, ACGTCC at 2683, AGTCCT at 2621, ATATCC at 2550, AGGACC at 2501, ACATCC at 2255, AGGACT at 2211, AGTCGT at 2199, ACAACC at 2185, AGTCGT at 2103, ACGTCT at 1937, ACATCC at 1875, ACGTCC at 1788, ACGACC at 1779, ACGACC at 1736, GGACGT at 1470, GGTCGT at 1458, GGACGT at 1370, GGTCGT at 1358, ATGACT at 1286, GGACGT at 1119, GGTCCT at 708, ACGTCC at 658, GGTCGT at 618, GGACCT at 599, ACGTCT at 438, GGACGT at 436, GGTCCT at 425, AGACCT at 271, AGACGT at 224, ACGTCC at 194, GGACGT at 192, GGTTCT at 178, GGACCT at 41, AGGTCC at 33, AGGTCT at 15, AGGTCC at 8.
DPE (Juven-Gershon) random dataset samplings
- DPEJGr0: 23, GGACCT at 4315, AGTCAT at 4170, AGACCT at 4049, GGACAT at 3848, AGTCGT at 3812, GGTCCT at 3613, GGTCGT at 3560, GGTTGT at 3018, GGTTGT at 2933, AGACGT at 2537, AGTTAT at 2140, GGACCT at 2104, GGTTCT at 2049, GGTCGT at 1753, AGTTGT at 1668, AGATAT at 1538, AGATAT at 1466, GGATCT at 925, GGATGT at 763, GGATCT at 737, GGTTGT at 723, GGTCAT at 700, AGTTAT at 558.
- DPEJGr1: 25, AGACGT at 4489, GGTCGT at 4203, AGACAT at 4197, GGTTAT at 4123, AGATCT at 3861, GGTCCT at 3643, GGACAT at 3142, GGTTCT at 2873, GGTTCT at 2845, GGACAT at 2618, GGTCAT at 2238, GGTCGT at 2012, AGACCT at 1950, AGTCCT at 1718, AGTCGT at 1653, GGACGT at 1511, GGTTCT at 1500, GGTTAT at 1472, GGATGT at 1449, GGACAT at 1342, GGACGT at 1184, AGACCT at 759, AGATGT at 401, AGATGT at 346, GGACAT at 89.
- DPEJGr2: 20, AGACAT at 4045, GGACAT at 3250, AGTCGT at 2873, GGTTCT at 2738, GGACCT at 2405, AGATAT at 2114, AGACGT at 1861, GGACAT at 1777, AGATGT at 1648, GGACGT at 1603, AGACCT at 1507, GGTCCT at 1401, AGTTGT at 1036, AGTTCT at 879, AGACCT at 683, GGACAT at 495, GGACAT at 218, AGTCAT at 200, GGTTCT at 179, GGATCT at 158.
- DPEJGr3: 32, GGTTGT at 4513, GGTTAT at 4414, AGTTCT at 4219, AGTTCT at 3988, GGACAT at 3885, AGTCCT at 3028, GGTCAT at 2855, AGTCGT at 2687, GGACCT at 2652, GGATCT at 2567, AGTCCT at 2550, GGATCT at 2435, GGTCAT at 2304, AGTTGT at 2264, GGATGT at 1791, AGATGT at 1715, GGTTCT at 1683, GGTCAT at 1632, AGTTCT at 1539, GGTTGT at 1193, GGACAT at 1167, AGTTCT at 1019, AGATAT at 941, GGTCAT at 748, AGACAT at 661, GGACGT at 629, GGTTGT at 529, AGTCAT at 478, GGTCCT at 348, GGTTCT at 329, GGTTAT at 258, GGACCT at 76.
- DPEJGr4: 24, GGTTCT at 4460, GGATGT at 4071, AGTCCT at 3971, GGATCT at 3923, GGTTGT at 3530, AGACAT at 3368, AGTCCT at 3317, AGATCT at 3133, GGTCCT at 2667, GGTTGT at 2435, GGACGT at 2313, GGTTAT at 2293, AGTCAT at 2249, GGTTGT at 2161, AGTTGT at 2007, AGATCT at 1670, GGACGT at 1605, GGATCT at 1495, AGACGT at 1452, GGTCCT at 799, GGATGT at 728, GGTCGT at 358, GGACGT at 291, AGACCT at 260.
- DPEJGr5: 17, GGACCT at 4345, AGTCAT at 4250, GGTCAT at 4052, GGTTCT at 3378, GGTTAT at 3233, GGTCGT at 3121, AGTTCT at 3089, AGTCGT at 2940, AGATCT at 2830, GGTTCT at 2557, GGATCT at 2429, AGTCGT at 2195, AGACGT at 1586, GGACCT at 1383, AGACAT at 1257, AGACGT at 711, GGTTCT at 210.
- DPEJGr6: 22, GGTCCT at 4465, GGTTCT at 4117, AGTCCT at 4012, AGTCGT at 3837, GGACAT at 3657, AGTCGT at 3489, GGACCT at 3416, AGTTCT at 3091, AGATCT at 2937, AGTTAT at 2806, GGATGT at 2708, AGTTGT at 2639, AGATCT at 2391, GGACCT at 2016, GGTTCT at 1515, AGATAT at 1242, AGTCAT at 1110, AGACCT at 855, GGACAT at 792, GGACGT at 609, GGTTGT at 586, GGTCGT at 157.
- DPEJGr7: 38, GGATAT at 4531, GGACAT at 4378, AGTCGT at 4300, GGATGT at 4170, GGATGT at 4100, AGTTAT at 4081, AGACGT at 4032, GGTTGT at 3908, AGTTAT at 3773, AGTCCT at 3577, GGACGT at 3402, GGATCT at 3368, GGACAT at 3168, AGATAT at 3054, AGTTGT at 2938, GGACGT at 2805, GGTCGT at 2781, AGTTCT at 2572, GGATCT at 2461, AGTCGT at 2263, GGTTGT at 2138, AGTCCT at 2097, GGTCCT at 2052, AGTCCT at 2013, GGTTGT at 1883, AGTTAT at 1806, GGACAT at 1648, GGTTGT at 1310, AGTCAT at 1015, GGTCAT at 973, GGACAT at 798, GGTTCT at 735, GGACGT at 696, GGTCAT at 635, GGATGT at 315, GGTCGT at 264, AGTCAT at 203, GGATGT at 117.
- DPEJGr8: 28, GGTCCT at 4350, AGACCT at 4264, AGACGT at 4065, GGACAT at 4027, AGATAT at 3980, GGTCGT at 3844, GGACGT at 3601, GGATAT at 3555, AGACGT at 3542, AGACAT at 3218, GGACCT at 3008, AGATAT at 2999, GGTCCT at 2877, AGTCCT at 2804, AGATGT at 2727, GGTTGT at 2380, AGTCCT at 2208, GGATAT at 1701, GGTCCT at 1457, GGTTAT at 1335, GGTCCT at 1196, AGACAT at 1155, AGTCCT at 1013, GGTTAT at 552, AGATGT at 522, GGTCGT at 247, GGTTAT at 92, AGATCT at 27.
- DPEJGr9: 18, GGTTCT at 4465, AGATGT at 4418, GGATGT at 4388, AGTTCT at 4286, GGATAT at 4226, GGTTAT at 4176, AGATAT at 4020, AGACAT at 3639, GGACGT at 3509, AGATAT at 2994, AGACAT at 2918, GGTTAT at 2893, GGTCGT at 2613, GGACCT at 2334, GGACAT at 1846, AGACAT at 1631, GGATGT at 1468, GGTCCT at 1459.
- DPEJGr0ci: 23, ACATCC at 4416, AGGACC at 4314, AGAACC at 4308, AGAACC at 3587, AGAACT at 3494, ATGTCC at 3377, AGAACC at 3322, AGGACC at 2801, AGATCC at 2762, AGGTCC at 2747, ATGACC at 2730, ACGACC at 2628, ACGTCC at 2594, ACGTCT at 2539, AGATCC at 2250, ATGTCT at 2115, ATAACT at 1792, AGGTCT at 1047, AGGACC at 942, ATGACT at 704, AGGACC at 678, ATAACT at 431, AGAACC at 55.
- DPEJGr1ci: 31, AGGACC at 4466, ACAACC at 4321, ACATCT at 4097, AGATCT at 3861, ATGACC at 3635, ATAACC at 3475, ACGTCC at 3430, ACGTCC at 3300, AGGACT at 3277, ACGACT at 3246, AGGACC at 3116, AGATCC at 2835, ACAACT at 2751, AGGTCC at 2560, AGGACC at 2461, ATAACT at 2157, AGGACC at 2122, ACGACT at 1828, AGATCC at 1799, ACGACC at 1762, ACGACC at 1736, ATGACT at 1635, ACATCC at 1203, ACGTCC at 1186, ACATCT at 1130, AGAACT at 1035, AGGACT at 647, ACGACT at 618, AGAACT at 471, ATATCT at 59, ATGTCT at 24.
- DPEJGr2ci: 17, ACAACC at 3839, ATAACC at 3824, ATGACT at 3809, ACATCC at 3179, ACATCC at 2810, AGGTCC at 2613, ATATCT at 2415, AGGACT at 2333, ACGACC at 2317, ACAACC at 2193, AGGACT at 1676, ACGTCT at 1661, ACAACT at 1309, ATAACC at 1140, ATGTCT at 920, ATGACT at 373, ACATCC at 220.
- DPEJGr3ci: 22, AGAACT at 4531, ATGTCT at 4388, ATAACC at 3903, AGATCC at 3825, ATAACC at 3674, ATAACC at 3447, ATAACC at 3339, AGGACC at 3318, ACATCT at 3285, ATGTCC at 3188, ACGTCT at 2585, AGAACC at 2243, AGAACC at 1301, ATAACT at 1290, AGGACC at 1274, ATATCT at 943, AGAACC at 691, AGGTCT at 679, AGGACC at 342, AGGTCT at 246, AGAACT at 221, AGAACC at 188.
- DPEJGr4ci: 25, ATATCT at 4475, ACGTCC at 4367, ACGTCT at 4334, ACATCT at 4298, ACGACC at 4045, ACGACT at 3938, ACAACT at 3712, ACGTCT at 3509, AGATCT at 3133, AGATCC at 3048, AGAACT at 2998, AGGACT at 2956, ACAACC at 2856, ATAACT at 2378, ACGTCT at 2315, AGAACT at 1747, AGAACT at 1740, AGATCT at 1670, AGGTCT at 980, AGGTCC at 676, AGAACT at 579, ATAACT at 466, ACAACC at 421, AGAACT at 267, AGGACT at 9.
- DPEJGr5ci: 14, ACGTCC at 4403, AGATCC at 4031, AGAACC at 3986, AGAACC at 3394, AGGACT at 3365, AGATCT at 2830, ATAACC at 2093, AGAACC at 1948, ATAACT at 1873, AGAACC at 1430, ACGTCC at 905, AGAACC at 613, ATGACC at 505, AGAACT at 342.
- DPEJGr6ci: 25, AGAACC at 4404, AGGACT at 3989, ACGACC at 3784, ATGACC at 3192, ACGACC at 2954, AGATCT at 2937, ACGTCT at 2927, AGATCT at 2391, ACAACC at 2358, ATATCC at 2352, ATAACT at 2241, ATATCC at 2054, ATATCT at 2038, ACAACC at 1917, ACAACT at 1808, ATGTCT at 1291, AGGACC at 1225, ATAACT at 1133, ACATCT at 810, AGGACC at 780, ACGTCC at 611, AGGACC at 387, AGGACC at 374, AGGTCC at 306, ATGTCC at 46.
- DPEJGr7ci: 24, ACGTCC at 4259, ATGTCT at 4004, ACAACC at 3418, ACATCC at 3170, ACGTCC at 2860, ATAACT at 2828, AGGTCC at 2812, AGGACC at 2795, ACGTCT at 2705, AGGACC at 2688, AGGTCC at 2676, ACAACC at 2550, ACAACC at 2408, ACGACT at 2363, ATAACT at 2329, AGGACT at 2070, ACGTCC at 1576, ACGACC at 1522, ATAACC at 1447, ATGTCC at 1145, ACGTCC at 946, ACGTCT at 698, ATGACT at 393, ACAACC at 63.
- DPEJGr8ci: 23, AGGTCC at 4349, ACGACT at 4184, ACAACC at 4082, ACATCC at 4029, ACGACC at 3935, AGGACC at 3647, ACAACC at 3233, ACATCT at 3220, AGAACC at 3043, ACAACC at 2604, ATAACT at 2144, ACAACC at 2105, AGGTCC at 2075, ACAACT at 1906, AGGACT at 1765, AGGTCT at 1517, AGGTCC at 1417, ATGACT at 1133, AGAACC at 1069, AGGTCC at 1023, AGAACT at 885, AGGTCT at 399, AGATCT at 27.
- DPEJGr9ci: 22, ATATCT at 4329, ATATCC at 4228, ATAACC at 4143, ACATCC at 3750, ATGACT at 3704, ATGACC at 3680, AGGACC at 3378, ATGTCT at 3298, ATATCC at 2996, AGGTCT at 2648, AGATCC at 2431, ACGACC at 2148, ATATCC at 2097, ATGACT at 2080, ACATCC at 2024, AGGACT at 1596, ACATCC at 1523, AGGACT at 1385, AGGTCC at 575, AGGTCT at 450, ACGACC at 288, ACAACT at 120.
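For comparison with the counts above, the number of chance matches expected in a random sequence can be estimated from the consensus alone. The sketch below assumes that the four bases are equally likely and independent at every position, which the page does not state about the random datasets. The sequence length of 4560 is a hypothetical value, chosen only because the reported positions run up to about 4550.

```python
from fractions import Fraction

# Number of bases allowed by each IUPAC nucleotide code.
IUPAC_SIZES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "W": 2, "S": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}

def match_probability(consensus: str) -> Fraction:
    """Chance that one window of uniformly random bases matches `consensus`."""
    p = Fraction(1)
    for code in consensus:
        p *= Fraction(IUPAC_SIZES[code], 4)
    return p

def expected_matches(consensus: str, length: int) -> Fraction:
    """Expected number of matching windows in a random sequence of `length`."""
    windows = length - len(consensus) + 1
    return match_probability(consensus) * windows

# Counts reported above for DPEJGr0 to DPEJGr9.
reported = [23, 25, 20, 32, 24, 17, 22, 38, 28, 18]

ASSUMED_LENGTH = 4560  # hypothetical length, not stated on this page

print(match_probability("RGWYVT"))                      # 3/512
print(float(expected_matches("RGWYVT", ASSUMED_LENGTH)))
print(sum(reported) / len(reported))                    # 24.7
```

Under these assumptions the estimate is roughly 27 matches per dataset, of the same order as the reported mean of 24.7. The inverse complement pattern ABRWCY has the same per-window probability, 3/512.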
DPEJGr UTRs
- DPEJGr0: GGACCT at 4315, AGTCAT at 4170, AGACCT at 4049, GGACAT at 3848, AGTCGT at 3812, GGTCCT at 3613, GGTCGT at 3560, GGTTGT at 3018, GGTTGT at 2933.
- DPEJGr2: AGACAT at 4045, GGACAT at 3250, AGTCGT at 2873.
- DPEJGr4: GGTTCT at 4460, GGATGT at 4071, AGTCCT at 3971, GGATCT at 3923, GGTTGT at 3530, AGACAT at 3368, AGTCCT at 3317, AGATCT at 3133.
- DPEJGr6: GGTCCT at 4465, GGTTCT at 4117, AGTCCT at 4012, AGTCGT at 3837, GGACAT at 3657, AGTCGT at 3489, GGACCT at 3416, AGTTCT at 3091, AGATCT at 2937.
- DPEJGr8: GGTCCT at 4350, AGACCT at 4264, AGACGT at 4065, GGACAT at 4027, AGATAT at 3980, GGTCGT at 3844, GGACGT at 3601, GGATAT at 3555, AGACGT at 3542, AGACAT at 3218, GGACCT at 3008, AGATAT at 2999, GGTCCT at 2877.
- DPEJGr0ci: ACATCC at 4416, AGGACC at 4314, AGAACC at 4308, AGAACC at 3587, AGAACT at 3494, ATGTCC at 3377, AGAACC at 3322.
- DPEJGr2ci: ACAACC at 3839, ATAACC at 3824, ATGACT at 3809, ACATCC at 3179.
- DPEJGr4ci: ATATCT at 4475, ACGTCC at 4367, ACGTCT at 4334, ACATCT at 4298, ACGACC at 4045, ACGACT at 3938, ACAACT at 3712, ACGTCT at 3509, AGATCT at 3133, AGATCC at 3048, AGAACT at 2998, AGGACT at 2956, ACAACC at 2856.
- DPEJGr6ci: AGAACC at 4404, AGGACT at 3989, ACGACC at 3784, ATGACC at 3192, ACGACC at 2954, AGATCT at 2937, ACGTCT at 2927.
- DPEJGr8ci: AGGTCC at 4349, ACGACT at 4184, ACAACC at 4082, ACATCC at 4029, ACGACC at 3935, AGGACC at 3647, ACAACC at 3233, ACATCT at 3220, AGAACC at 3043.
DPEJGr core promoters
- DPEJGr1: AGACGT at 4489.
- DPEJGr3: GGTTGT at 4513, GGTTAT at 4414.
- DPEJGr5: GGACCT at 4345.
- DPEJGr7: GGATAT at 4531, GGACAT at 4378, AGTCGT at 4300.
- DPEJGr9: GGTTCT at 4465, AGATGT at 4418, GGATGT at 4388, AGTTCT at 4286.
- DPEJGr1ci: AGGACC at 4466, ACAACC at 4321.
- DPEJGr3ci: AGAACT at 4531, ATGTCT at 4388.
- DPEJGr5ci: ACGTCC at 4403.
- DPEJGr9ci: ATATCT at 4329.
DPEJGr proximal promoters
- DPEJGr2: GGTTCT at 2738.
- DPEJGr4: GGTCCT at 2667.
- DPEJGr6: AGTTAT at 2806, GGATGT at 2708, AGTTGT at 2639.
- DPEJGr8: AGTCCT at 2804, AGATGT at 2727.
- DPEJGr0ci: AGGACC at 2801, AGATCC at 2762, AGGTCC at 2747, ATGACC at 2730, ACGACC at 2628.
- DPEJGr2ci: ACATCC at 2810, AGGTCC at 2613.
- DPEJGr8ci: ACAACC at 2604.
- DPEJGr1: GGTCGT at 4203, AGACAT at 4197, GGTTAT at 4123.
- DPEJGr3: AGTTCT at 4219.
- DPEJGr5: AGTCAT at 4250, GGTCAT at 4052.
- DPEJGr7: GGATGT at 4170, GGATGT at 4100, AGTTAT at 4081.
- DPEJGr9: GGATAT at 4226, GGTTAT at 4176.
- DPEJGr1ci: ACATCT at 4097.
- DPEJGr7ci: ACGTCC at 4259.
- DPEJGr9ci: ATATCC at 4228, ATAACC at 4143.
DPEJGr distal promoters
- DPEJGr0: AGACGT at 2537, AGTTAT at 2140, GGACCT at 2104, GGTTCT at 2049, GGTCGT at 1753, AGTTGT at 1668, AGATAT at 1538, AGATAT at 1466, GGATCT at 925, GGATGT at 763, GGATCT at 737, GGTTGT at 723, GGTCAT at 700, AGTTAT at 558.
- DPEJGr2: GGACCT at 2405, AGATAT at 2114, AGACGT at 1861, GGACAT at 1777, AGATGT at 1648, GGACGT at 1603, AGACCT at 1507, GGTCCT at 1401, AGTTGT at 1036, AGTTCT at 879, AGACCT at 683, GGACAT at 495, GGACAT at 218, AGTCAT at 200, GGTTCT at 179, GGATCT at 158.
- DPEJGr4: GGTTGT at 2435, GGACGT at 2313, GGTTAT at 2293, AGTCAT at 2249, GGTTGT at 2161, AGTTGT at 2007, AGATCT at 1670, GGACGT at 1605, GGATCT at 1495, AGACGT at 1452, GGTCCT at 799, GGATGT at 728, GGTCGT at 358, GGACGT at 291, AGACCT at 260.
- DPEJGr6: AGATCT at 2391, GGACCT at 2016, GGTTCT at 1515, AGATAT at 1242, AGTCAT at 1110, AGACCT at 855, GGACAT at 792, GGACGT at 609, GGTTGT at 586, GGTCGT at 157.
- DPEJGr8: GGTTGT at 2380, AGTCCT at 2208, GGATAT at 1701, GGTCCT at 1457, GGTTAT at 1335, GGTCCT at 1196, AGACAT at 1155, AGTCCT at 1013, GGTTAT at 552, AGATGT at 522, GGTCGT at 247, GGTTAT at 92, AGATCT at 27.
- DPEJGr0ci: ACGTCC at 2594, ACGTCT at 2539, AGATCC at 2250, ATGTCT at 2115, ATAACT at 1792, AGGTCT at 1047, AGGACC at 942, ATGACT at 704, AGGACC at 678, ATAACT at 431, AGAACC at 55.
- DPEJGr2ci: ATATCT at 2415, AGGACT at 2333, ACGACC at 2317, ACAACC at 2193, AGGACT at 1676, ACGTCT at 1661, ACAACT at 1309, ATAACC at 1140, ATGTCT at 920, ATGACT at 373, ACATCC at 220.
- DPEJGr4ci: ATAACT at 2378, ACGTCT at 2315, AGAACT at 1747, AGAACT at 1740, AGATCT at 1670, AGGTCT at 980, AGGTCC at 676, AGAACT at 579, ATAACT at 466, ACAACC at 421, AGAACT at 267, AGGACT at 9.
- DPEJGr6ci: AGATCT at 2391, ACAACC at 2358, ATATCC at 2352, ATAACT at 2241, ATATCC at 2054, ATATCT at 2038, ACAACC at 1917, ACAACT at 1808, ATGTCT at 1291, AGGACC at 1225, ATAACT at 1133, ACATCT at 810, AGGACC at 780, ACGTCC at 611, AGGACC at 387, AGGACC at 374, AGGTCC at 306, ATGTCC at 46.
- DPEJGr8ci: ATAACT at 2144, ACAACC at 2105, AGGTCC at 2075, ACAACT at 1906, AGGACT at 1765, AGGTCT at 1517, AGGTCC at 1417, ATGACT at 1133, AGAACC at 1069, AGGTCC at 1023, AGAACT at 885, AGGTCT at 399, AGATCT at 27.
Positive direction
- DPEJGr1: AGATCT at 3861, GGTCCT at 3643, GGACAT at 3142, GGTTCT at 2873, GGTTCT at 2845, GGACAT at 2618, GGTCAT at 2238, GGTCGT at 2012, AGACCT at 1950, AGTCCT at 1718, AGTCGT at 1653, GGACGT at 1511, GGTTCT at 1500, GGTTAT at 1472, GGATGT at 1449, GGACAT at 1342, GGACGT at 1184, AGACCT at 759, AGATGT at 401, AGATGT at 346, GGACAT at 89.
- DPEJGr3: AGTTCT at 3988, GGACAT at 3885, AGTCCT at 3028, GGTCAT at 2855, AGTCGT at 2687, GGACCT at 2652, GGATCT at 2567, AGTCCT at 2550, GGATCT at 2435, GGTCAT at 2304, AGTTGT at 2264, GGATGT at 1791, AGATGT at 1715, GGTTCT at 1683, GGTCAT at 1632, AGTTCT at 1539, GGTTGT at 1193, GGACAT at 1167, AGTTCT at 1019, AGATAT at 941, GGTCAT at 748, AGACAT at 661, GGACGT at 629, GGTTGT at 529, AGTCAT at 478, GGTCCT at 348, GGTTCT at 329, GGTTAT at 258, GGACCT at 76.
- DPEJGr5: GGTTCT at 3378, GGTTAT at 3233, GGTCGT at 3121, AGTTCT at 3089, AGTCGT at 2940, AGATCT at 2830, GGTTCT at 2557, GGATCT at 2429, AGTCGT at 2195, AGACGT at 1586, GGACCT at 1383, AGACAT at 1257, AGACGT at 711, GGTTCT at 210.
- DPEJGr7: AGACGT at 4032, GGTTGT at 3908, AGTTAT at 3773, AGTCCT at 3577, GGACGT at 3402, GGATCT at 3368, GGACAT at 3168, AGATAT at 3054, AGTTGT at 2938, GGACGT at 2805, GGTCGT at 2781, AGTTCT at 2572, GGATCT at 2461, AGTCGT at 2263, GGTTGT at 2138, AGTCCT at 2097, GGTCCT at 2052, AGTCCT at 2013, GGTTGT at 1883, AGTTAT at 1806, GGACAT at 1648, GGTTGT at 1310, AGTCAT at 1015, GGTCAT at 973, GGACAT at 798, GGTTCT at 735, GGACGT at 696, GGTCAT at 635, GGATGT at 315, GGTCGT at 264, AGTCAT at 203, GGATGT at 117.
- DPEJGr9: AGATAT at 4020, AGACAT at 3639, GGACGT at 3509, AGATAT at 2994, AGACAT at 2918, GGTTAT at 2893, GGTCGT at 2613, GGACCT at 2334, GGACAT at 1846, AGACAT at 1631, GGATGT at 1468, GGTCCT at 1459.
- DPEJGr1ci: AGATCT at 3861, ATGACC at 3635, ATAACC at 3475, ACGTCC at 3430, ACGTCC at 3300, AGGACT at 3277, ACGACT at 3246, AGGACC at 3116, AGATCC at 2835, ACAACT at 2751, AGGTCC at 2560, AGGACC at 2461, ATAACT at 2157, AGGACC at 2122, ACGACT at 1828, AGATCC at 1799, ACGACC at 1762, ACGACC at 1736, ATGACT at 1635, ACATCC at 1203, ACGTCC at 1186, ACATCT at 1130, AGAACT at 1035, AGGACT at 647, ACGACT at 618, AGAACT at 471, ATATCT at 59, ATGTCT at 24.
- DPEJGr3ci: ATAACC at 3903, AGATCC at 3825, ATAACC at 3674, ATAACC at 3447, ATAACC at 3339, AGGACC at 3318, ACATCT at 3285, ATGTCC at 3188, ACGTCT at 2585, AGAACC at 2243, AGAACC at 1301, ATAACT at 1290, AGGACC at 1274, ATATCT at 943, AGAACC at 691, AGGTCT at 679, AGGACC at 342, AGGTCT at 246, AGAACT at 221, AGAACC at 188.
- DPEJGr5ci: AGATCC at 4031, AGAACC at 3986, AGAACC at 3394, AGGACT at 3365, AGATCT at 2830, ATAACC at 2093, AGAACC at 1948, ATAACT at 1873, AGAACC at 1430, ACGTCC at 905, AGAACC at 613, ATGACC at 505, AGAACT at 342.
- DPEJGr7ci: ATGTCT at 4004, ACAACC at 3418, ACATCC at 3170, ACGTCC at 2860, ATAACT at 2828, AGGTCC at 2812, AGGACC at 2795, ACGTCT at 2705, AGGACC at 2688, AGGTCC at 2676, ACAACC at 2550, ACAACC at 2408, ACGACT at 2363, ATAACT at 2329, AGGACT at 2070, ACGTCC at 1576, ACGACC at 1522, ATAACC at 1447, ATGTCC at 1145, ACGTCC at 946, ACGTCT at 698, ATGACT at 393, ACAACC at 63.
- DPEJGr9ci: ACATCC at 3750, ATGACT at 3704, ATGACC at 3680, AGGACC at 3378, ATGTCT at 3298, ATATCC at 2996, AGGTCT at 2648, AGATCC at 2431, ACGACC at 2148, ATATCC at 2097, ATGACT at 2080, ACATCC at 2024, AGGACT at 1596, ACATCC at 1523, AGGACT at 1385, AGGTCC at 575, AGGTCT at 450, ACGACC at 288, ACAACT at 120.
DPE (Juven-Gershon) analysis and results
The reals have thirty-four consensus sequences on the negative strand in the negative direction in the UTR between ZSCAN22 and A1BG for an occurrence of 34.0. In the negative direction on the positive strand there are fourteen sequences for an occurrence of 14.0 and an average occurrence of 24.0.
The randoms had eighty-two for an occurrence of 8.2, which is well below the reals, suggesting that the real UTR sequences are likely active or activable.
For the core promoters, the negative direction has one for an occurrence of 0.5. In the positive direction the reals have three for an occurrence of 1.5, an average of 1.0. For the randoms, there are no core promoters in the arbitrary negative direction for an occurrence of 0.0. In the positive direction, the randoms had eleven core promoters for an occurrence of 1.1, for an average occurrence of 0.55. Each is systematically lower than the reals, suggesting that the reals are likely active or activable.
The real proximal promoters have three in the negative direction for an occurrence of 1.5. There are seven in the positive direction for an occurrence of 3.5. The randoms had fifteen in the arbitrary negative direction for an occurrence of 1.5. In the positive direction the randoms had fifteen for an occurrence of 1.5. The reals in the negative direction are likely random, whereas the reals in the positive direction are likely active or activable.
The real distal promoters have ninety-three in the positive direction for an occurrence of 46.5, and fifty-seven in the negative direction for an occurrence of 28.5.
The randoms had two hundred and eleven in the arbitrary positive direction for an occurrence of 21.1, and the negative direction had one hundred and thirty-three for an occurrence of 13.1. These are also systematically lower than the reals suggesting that the real distal occurrences are likely active or activable.
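The comparisons above all follow the same arithmetic: divide the number of hits by the number of strands (reals) or datasets (randoms) sampled, then compare the two occurrences. The following is a minimal Python sketch of that reasoning, using the proximal promoter counts quoted above. The function names and the three labels are illustrative assumptions, not part of the original Basic programs, and no statistical test is implied.

```python
import math

def occurrence(hits, n_samples):
    """Hits per sampled strand (reals) or per random dataset (randoms)."""
    return hits / n_samples

def compare(real_occ, random_occ):
    """Label a real occurrence relative to the random baseline."""
    if math.isclose(real_occ, random_occ):
        return "likely random"
    if real_occ > random_occ:
        return "higher than random"
    return "lower than random"

# Proximal promoters, negative direction:
# 3 real hits on 2 strands versus 15 random hits in 10 datasets.
negative = compare(occurrence(3, 2), occurrence(15, 10))
# Proximal promoters, positive direction:
# 7 real hits on 2 strands versus 15 random hits in 10 datasets.
positive = compare(occurrence(7, 2), occurrence(15, 10))

print(negative)  # likely random
print(positive)  # higher than random
```

In the text above, an occurrence that is higher than random is read as "likely active or activable".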
DPE (Kadonaga) samplings
Copying the responsive element consensus sequence (A/G)G(A/T)CGTG into a "⌘F" search finds two between ZNF497 and A1BG or seven between ZSCAN22 and A1BG, which the computer programs can also find.
The Basic programs testing the consensus sequence (A/G)G(A/T)CGTG (starting with SuccessablesDPEK.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the program, the sequence it is looking for, the number found, and the hits:
- negative strand, negative direction, looking for (A/G)G(A/T)CGTG, 7, GGTCGTG at 3733, GGTCGTG at 3072, GGTCGTG at 1787, GGTCGTG at 1142, GGTCGTG at 678, GGTCGTG at 542, GGTCGTG at 405.
- positive strand, negative direction, looking for (A/G)G(A/T)CGTG, 1, AGACGTG at 4237.
- positive strand, positive direction, looking for (A/G)G(A/T)CGTG, 8, AGTCGTG at 3043, AGTCGTG at 2200, AGTCGTG at 2104, GGACGTG at 1471, GGTCGTG at 1459, GGACGTG at 1371, GGTCGTG at 1359, GGTCGTG at 619.
- negative strand, positive direction, looking for (A/G)G(A/T)CGTG, 0.
- complement, negative strand, negative direction, looking for (C/T)C(A/T)GCAC, 1, TCTGCAC at 4237.
- complement, positive strand, negative direction, looking for (C/T)C(A/T)GCAC, 7, CCAGCAC at 3733, CCAGCAC at 3072, CCAGCAC at 1787, CCAGCAC at 1142, CCAGCAC at 678, CCAGCAC at 542, CCAGCAC at 405.
- complement, positive strand, positive direction, looking for (C/T)C(A/T)GCAC, 0.
- complement, negative strand, positive direction, looking for (C/T)C(A/T)GCAC, 8, TCAGCAC at 3043, TCAGCAC at 2200, TCAGCAC at 2104, CCAGCAC at 1471, CCAGCAC at 1459, CCAGCAC at 1371, CCAGCAC at 1359, CCAGCAC at 619.
- inverse complement, negative strand, negative direction, looking for CACG(A/T)C(C/T), 1, CACGTCT at 3431.
- inverse complement, positive strand, negative direction, looking for CACG(A/T)C(C/T), 1, CACGTCT at 1774.
- inverse complement, positive strand, positive direction, looking for CACG(A/T)C(C/T), 3, CACGTCC at 3466, CACGTCC at 2683, CACGTCC at 1788.
- inverse complement, negative strand, positive direction, looking for CACG(A/T)C(C/T), 1, CACGTCT at 3256.
- inverse negative strand, negative direction, looking for GTGC(A/T)G(A/G), 1, GTGCAGA at 1774.
- inverse positive strand, negative direction, looking for GTGC(A/T)G(A/G), 1, GTGCAGA at 3431.
- inverse positive strand, positive direction, looking for GTGC(A/T)G(A/G), 1, GTGCAGA at 3256.
- inverse negative strand, positive direction, looking for GTGC(A/T)G(A/G), 3, GTGCAGG at 3466, GTGCAGG at 2683, GTGCAGG at 1788.
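The four search orientations listed above (direct, complement, inverse complement, inverse) can be derived mechanically from the consensus. Below is a minimal Python sketch, not the original Basic programs: it writes (A/G)G(A/T)CGTG in IUPAC codes as RGWCGTG, and the function names are hypothetical. Positions are counted 1-based from the start of the string, which is an assumption and may differ from the numbering used in the lists above.

```python
import re

# IUPAC codes needed for this consensus: R = A/G, Y = C/T, W = A/T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "W": "AT"}
# Complement of each code (W is its own complement).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C",
              "R": "Y", "Y": "R", "W": "W"}

def complement(motif):
    return "".join(COMPLEMENT[base] for base in motif)

def inverse(motif):
    return motif[::-1]

def inverse_complement(motif):
    return complement(motif)[::-1]

def to_regex(motif):
    """Turn an IUPAC motif into a regular expression."""
    return "".join(b if len(IUPAC[b]) == 1 else "[" + IUPAC[b] + "]"
                   for b in motif)

def find_hits(motif, sequence):
    """Return (matched sequence, 1-based position) pairs, overlaps included."""
    pattern = re.compile("(?=(" + to_regex(motif) + "))")
    return [(m.group(1), m.start() + 1) for m in pattern.finditer(sequence)]

consensus = "RGWCGTG"
print(complement(consensus))          # YCWGCAC
print(inverse_complement(consensus))  # CACGWCY
print(inverse(consensus))             # GTGCWGR

# A made-up test string, not taken from the A1BG region.
print(find_hits(consensus, "TTGGTCGTGAAAGACGTGCC"))
# [('GGTCGTG', 3), ('AGACGTG', 12)]
```

The lookahead in `find_hits` is a design choice so that overlapping matches are not skipped.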
DPE (Kadonaga) UTRs
Negative strand, negative direction: GGTCGTG at 3733, CACGTCT at 3431, GGTCGTG at 3072.
Positive strand, negative direction: AGACGTG at 4237.
DPE (Kadonaga) distal promoters
Negative strand, negative direction: GGTCGTG at 1787, GGTCGTG at 1142, GGTCGTG at 678, GGTCGTG at 542, GGTCGTG at 405.
Positive strand, negative direction: CACGTCC at 3466, CACGTCC at 2683, CACGTCC at 1788, CACGTCT at 1774.
Positive strand, positive direction: AGTCGTG at 3043, AGTCGTG at 2200, AGTCGTG at 2104, GGACGTG at 1471, GGTCGTG at 1459, GGACGTG at 1371, GGTCGTG at 1359, GGTCGTG at 619.
Negative strand, positive direction: CACGTCT at 3256.
DPE (Kadonaga) random dataset samplings
- DPEKr0: 0.
- DPEKr1: 0.
- DPEKr2: 0.
- DPEKr3: 0.
- DPEKr4: 1, AGACGTG at 1453.
- DPEKr5: 2, AGACGTG at 1587, AGACGTG at 712.
- DPEKr6: 0.
- DPEKr7: 1, AGTCGTG at 4301.
- DPEKr8: 1, GGTCGTG at 3845.
- DPEKr9: 0.
- DPEKr0ci: 1, CACGTCC at 2594.
- DPEKr1ci: 4, CACGTCC at 3430, CACGTCC at 3300, CACGACT at 3246, CACGACC at 1762.
- DPEKr2ci: 0.
- DPEKr3ci: 0.
- DPEKr4ci: 2, CACGTCC at 4367, CACGTCT at 3509.
- DPEKr5ci: 0.
- DPEKr6ci: 1, CACGTCT at 2927.
- DPEKr7ci: 2, CACGACC at 1522, CACGTCC at 946.
- DPEKr8ci: 0.
- DPEKr9ci: 1, CACGACC at 2148.
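As a rough baseline for counts like those above, the expected number of matches on one strand of a random sequence can be worked out directly, if every base is assumed to be independent and equally likely. The page does not state how the random datasets were generated, so that assumption, the sequence length, and the function name below are all hypothetical.

```python
def expected_hits(length, motif_len, n_variants):
    """Expected matches of a motif on one strand of a uniform random sequence.

    length      -- number of bases in the sequence (hypothetical here)
    motif_len   -- number of positions in the motif
    n_variants  -- number of concrete sequences the consensus allows
    """
    windows = max(length - motif_len + 1, 0)  # possible start positions
    return windows * n_variants / 4 ** motif_len

# (A/G)G(A/T)CGTG has 7 positions and 2 * 2 = 4 concrete variants.
print(expected_hits(4102, 7, 4))  # 1.0
```

Under these assumptions, a sequence of roughly four thousand bases would be expected to contain about one match per strand by chance.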
DPEKr UTRs
- DPEKr4: AGACGTG at 1453.
- DPEKr8: GGTCGTG at 3845.
- DPEKr4ci: CACGTCC at 4367, CACGTCT at 3509.
- DPEKr6ci: CACGTCT at 2927.
DPEKr core promoters
- DPEKr7: AGTCGTG at 4301.
DPEKr distal promoters
- DPEKr0ci: CACGTCC at 2594.
- DPEKr5: AGACGTG at 1587, AGACGTG at 712.
- DPEKr1ci: CACGTCC at 3430, CACGTCC at 3300, CACGACT at 3246, CACGACC at 1762.
- DPEKr7ci: CACGACC at 1522, CACGTCC at 946.
- DPEKr9ci: CACGACC at 2148.
DPE (Kadonaga) analysis and results
The early DPE consensus sequence was RGWCGTG.[5][7]
Reals or randoms | Promoters | direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
---|---|---|---|---|---|---|
Reals | UTR | negative | 4 | 2 | 2 | 2 ± 1 (--3,+-1) |
Randoms | UTR | arbitrary negative | 0 | 10 | 0 | 0 |
Randoms | UTR | alternate negative | 0 | 10 | 0 | 0 |
Reals | Core | negative | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
Reals | Core | positive | 0 | 2 | 0 | 0 |
Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0 |
Randoms | Core | alternate positive | 0 | 10 | 0 | 0 |
Reals | Proximal | negative | 0 | 2 | 0 | 0 |
Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0 |
Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0 |
Reals | Proximal | positive | 0 | 2 | 0 | 0 |
Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0 |
Randoms | Proximal | alternate positive | 0 | 10 | 0 | 0 |
Reals | Distal | negative | 9 | 2 | 4.5 | 4.5 ± 0.5 (--5,+-4) |
Randoms | Distal | arbitrary negative | 0 | 10 | 0 | 0 |
Randoms | Distal | alternate negative | 0 | 10 | 0 | 0 |
Reals | Distal | positive | 9 | 2 | 4.5 | 4.5 ± 3.5 (-+1,++8) |
Randoms | Distal | arbitrary positive | 0 | 10 | 0 | 0 |
Randoms | Distal | alternate positive | 0 | 10 | 0 | 0 |
Comparison:
The occurrences of the real responsive element consensus sequences are greater than those of the randoms. This suggests that the real responsive element consensus sequences are likely active or activable.
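The "Averages" column for the reals in the table above appears consistent with the mean of the two per-strand counts, plus or minus half the difference between them. This reading is an inference from the numbers in the table, not a definition given on the page. A short Python sketch with a hypothetical function name:

```python
def mean_and_half_range(per_strand_counts):
    """Mean of the counts and half the spread between largest and smallest."""
    mean = sum(per_strand_counts) / len(per_strand_counts)
    half_range = (max(per_strand_counts) - min(per_strand_counts)) / 2
    return mean, half_range

# Per-strand counts as given in parentheses in the table,
# e.g. (--5,+-4) is read here as 5 on one strand and 4 on the other.
rows = {
    "UTR negative": [3, 1],
    "Distal negative": [5, 4],
    "Distal positive": [1, 8],
}
for label, counts in rows.items():
    mean, spread = mean_and_half_range(counts)
    print(f"{label}: {mean} ± {spread}")
# UTR negative: 2.0 ± 1.0
# Distal negative: 4.5 ± 0.5
# Distal positive: 4.5 ± 3.5
```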
DPE (Matsumoto) samplings
Copying the responsive element consensus sequence AGTCTC into a "⌘F" search finds four between ZNF497 and A1BG or two between ZSCAN22 and A1BG, which the computer programs can also find.
The Basic programs testing the consensus sequence AGTCTC (starting with SuccessablesDPEM.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the program, the sequence it is looking for, the number found, and the hits:
- negative strand, negative direction, looking for AGTCTC, 1, AGTCTC at 3645.
- positive strand, negative direction, looking for AGTCTC, 1, AGTCTC at 1445.
- positive strand, positive direction, looking for AGTCTC, 3, AGTCTC at 2730, AGTCTC at 2700, AGTCTC at 2610.
- negative strand, positive direction, looking for AGTCTC, 1, AGTCTC at 3188.
- complement, negative strand, negative direction, looking for TCAGAG, 1, TCAGAG at 1445.
- complement, positive strand, negative direction, looking for TCAGAG, 1, TCAGAG at 3645.
- complement, positive strand, positive direction, looking for TCAGAG, 1, TCAGAG at 3188.
- complement, negative strand, positive direction, looking for TCAGAG, 3, TCAGAG at 2730, TCAGAG at 2700, TCAGAG at 2610.
- inverse complement, negative strand, negative direction, looking for GAGACT, 0.
- inverse complement, positive strand, negative direction, looking for GAGACT, 4, GAGACT at 4053, GAGACT at 1933, GAGACT at 1081, GAGACT at 915.
- inverse complement, positive strand, positive direction, looking for GAGACT, 2, GAGACT at 3123, GAGACT at 255.
- inverse complement, negative strand, positive direction, looking for GAGACT, 0.
- inverse negative strand, negative direction, looking for CTCTGA, 4, CTCTGA at 4053, CTCTGA at 1933, CTCTGA at 1081, CTCTGA at 915.
- inverse positive strand, negative direction, looking for CTCTGA, 0.
- inverse positive strand, positive direction, looking for CTCTGA, 0.
- inverse negative strand, positive direction, looking for CTCTGA, 2, CTCTGA at 3123, CTCTGA at 255.
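Half of the entries above mirror the other half: a hit on one strand reappears as its complement at the same position on the opposite strand, for example AGTCTC at 3645 on the negative strand and TCAGAG at 3645 on the positive strand. A small Python sketch of that consistency check follows; the helper name is hypothetical and the hit lists are copied from the entries above.

```python
# Base-by-base complement for unambiguous nucleotides.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def mirror(hits):
    """Complement each (sequence, position) hit, keeping the position."""
    return [(seq.translate(COMPLEMENT), pos) for seq, pos in hits]

# Direct search, negative strand, negative direction.
direct = [("AGTCTC", 3645)]
# Complement search, positive strand, negative direction.
complement_hits = [("TCAGAG", 3645)]
print(mirror(direct) == complement_hits)  # True

# Inverse complement search, positive strand, negative direction.
inverse_complement_hits = [("GAGACT", 4053), ("GAGACT", 1933),
                           ("GAGACT", 1081), ("GAGACT", 915)]
# Inverse search, negative strand, negative direction.
inverse_hits = [("CTCTGA", 4053), ("CTCTGA", 1933),
                ("CTCTGA", 1081), ("CTCTGA", 915)]
print(mirror(inverse_complement_hits) == inverse_hits)  # True
```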
DPE (Matsumoto) UTRs
Negative strand, negative direction: AGTCTC at 3645.
Positive strand, negative direction: GAGACT at 4053.
DPE (Matsumoto) distal promoters
Positive strand, negative direction: GAGACT at 1933, AGTCTC at 1445, GAGACT at 1081, GAGACT at 915.
Negative strand, positive direction: AGTCTC at 3188.
Positive strand, positive direction: GAGACT at 3123, AGTCTC at 2730, AGTCTC at 2700, AGTCTC at 2610, GAGACT at 255.
DPE (Matsumoto) random dataset samplings
- DPEMr0: 0.
- DPEMr1: 1, AGTCTC at 905.
- DPEMr2: 1, AGTCTC at 4519.
- DPEMr3: 1, AGTCTC at 3217.
- DPEMr4: 0.
- DPEMr5: 1, AGTCTC at 4010.
- DPEMr6: 0.
- DPEMr7: 0.
- DPEMr8: 0.
- DPEMr9: 0.
- DPEMr0ci: 1, GAGACT at 2272.
- DPEMr1ci: 1, GAGACT at 1413.
- DPEMr2ci: 1, GAGACT at 1114.
- DPEMr3ci: 0.
- DPEMr4ci: 1, GAGACT at 3618.
- DPEMr5ci: 0.
- DPEMr6ci: 2, GAGACT at 3055, GAGACT at 538.
- DPEMr7ci: 1, GAGACT at 4351.
- DPEMr8ci: 1, GAGACT at 4467.
- DPEMr9ci: 0.
DPEMr UTRs
- DPEMr2: AGTCTC at 4519.
- DPEMr4ci: GAGACT at 3618.
- DPEMr6ci: GAGACT at 3055.
- DPEMr8ci: GAGACT at 4467.
DPEMr core promoters
- DPEMr7ci: GAGACT at 4351.
DPEMr distal promoters
- DPEMr0ci: GAGACT at 2272.
- DPEMr2ci: GAGACT at 1114.
- DPEMr6ci: GAGACT at 538.
- DPEMr1: AGTCTC at 905.
- DPEMr3: AGTCTC at 3217.
- DPEMr5: AGTCTC at 4010.
- DPEMr1ci: GAGACT at 1413.
DPE (Matsumoto) analysis and results
The DPE in "the ATP‐binding cassette subfamily G member 2 gene in the marine pufferfish Takifugu rubripes" is 5'-AGTCTC-3'.[8]
The UTR per annotation release 109.2021119 has two consensus sequences, one on each strand, for an occurrence of 1.0. The randoms had one direct and three inverse complements for an occurrence of 0.4.
The real core and proximal promoters have no downstream promoter element such as the one in the ATP‐binding cassette subfamily G member 2 gene in the marine pufferfish Takifugu rubripes.[8] The randoms had only one in the core promoters for an occurrence of 0.05.
The real distal promoters have four in the negative direction and six in the positive direction for occurrences of 2.0 and 3.0, respectively. The randoms had three in the arbitrary negative direction and four in the positive direction for occurrences of 0.3 and 0.4, respectively.
By comparison, the reals have systematically higher occurrences for the UTR and distal promoters and lower for the core promoters. This suggests that the real consensus sequences for the DPE described[8] are likely active or activable.
Acknowledgements
The content on this page was first contributed by: Henry A. Hoff.
Initial content for this page in some instances came from Wikiversity.
References
- Jennifer E.F. Butler, James T. Kadonaga (October 15, 2002). "The RNA polymerase II core promoter: a key component in the regulation of gene expression". Genes & Development. 16 (20): 2583–92. doi:10.1101/gad.1026202. PMID 12381658.
- Tamar Juven-Gershon, James T. Kadonaga (March 15, 2010). "Regulation of Gene Expression via the Core Promoter and the Basal Transcriptional Machinery". Developmental Biology. 339 (2): 225–9. doi:10.1016/j.ydbio.2009.08.009. PMC 2830304. PMID 19682982.
- Thomas W. Burke and James T. Kadonaga (November 15, 1997). "The downstream core promoter element, DPE, is conserved from Drosophila to humans and is recognized by TAFII60 of Drosophila". Genes & Development. 11 (22): 3020–31. doi:10.1101/gad.11.22.3020. PMC 316699. PMID 9367984.
- Stephen T. Smale and James T. Kadonaga (July 2003). "The RNA Polymerase II Core Promoter" (PDF). Annual Review of Biochemistry. 72 (1): 449–79. doi:10.1146/annurev.biochem.72.121801.161520. PMID 12651739. Retrieved 2012-05-07.
- T.W. Burke and James T. Kadonaga (15 March 1996). "Drosophila TFIID binds to a conserved downstream basal promoter element that is present in many TATA-box-deficient promoters" (PDF). Genes & Development. 10 (6): 711–724. doi:10.1101/gad.10.6.711. PMID 8598298.
- Alan K. Kutach and James T. Kadonaga (July 2000). "The Downstream Promoter Element DPE Appears To Be as Widely Used as the TATA Box in Drosophila Core Promoters". Molecular and Cellular Biology. 20 (13): 4754–4764. doi:10.1128/MCB.20.13.4754-4764.2000. PMC 85905. PMID 10848601.
- James T. Kadonaga (September 2002). "The DPE, a core promoter element for transcription by RNA polymerase II" (PDF). Experimental & Molecular Medicine. 34 (4): 259–264. PMID 12515390.
- Takuya Matsumoto, Saemi Kitajima, Chisato Yamamoto, Mitsuru Aoyagi, Yoshiharu Mitoma, Hiroyuki Harada and Yuji Nagashima (9 August 2020). "Cloning and tissue distribution of the ATP-binding cassette subfamily G member 2 gene in the marine pufferfish Takifugu rubripes" (PDF). Fisheries Science. 86: 873–887. doi:10.1007/s12562-020-01451-z. Retrieved 27 September 2020.
- FlyBase (February 3, 2013). "Antp Antennapedia [ Drosophila melanogaster ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2013-02-07.
- HGNC (February 5, 2013). "HOXA7 homeobox A7 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2013-02-07.
- HGNC (February 5, 2013). "HOXA9 homeobox A9 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2013-02-07.
Further reading
- Jennifer E.F. Butler, James T. Kadonaga (October 15, 2002). "The RNA polymerase II core promoter: a key component in the regulation of gene expression". Genes & Development. 16 (20): 2583–92. doi:10.1101/gad.1026202. PMID 12381658.