H box gene transcriptions


Editor-In-Chief: Henry A. Hoff

Image: Green beans grow from Phaseolus vulgaris. Credit: wanko from Japan.

The "H-box [is] from the bean chalcone synthase gene Chs15 [23,24]."[1]

The "Phaseolus vulgaris chalcone synthase (PvCHS15)" gene has three H boxes between the G box and the TATA box; in order downstream from the G box, they bind MYB, KAP 2, and KAP 1, respectively.[1]

"Functional studies with the H-box indicated that it cannot function to a high level alone. Gain of function experiments, however, show that it is active in combination with a G-Box element [...] in transgenic tobacco plants in establishing the characteristic tissue-specific pattern of expression and mutations in either the H-box or G-Box reduced the response to tobacco mosaic virus (TMV) infection [24,30]."[1]

"A bZIP protein from soybean binds to the G-Box in the bean Chs15 promoter [36•]. This protein, G/HBF-1, can also bind to the adjacent H-box."[1]

"Although the mRNA and protein levels of G/HBF-1 do not increase during the induction of its putative target genes, the protein itself is rapidly phosphorylated and in vitro phosphorylation enhances binding to one (H-box III) out of the three H-boxes present in the Chs15 promoter."[1]

H box in animals

"A testis/brain RNA-binding protein, TB-RBP, binds to the Y- and H-boxes in the Prm2 3′ UTR and represses translation of a reporter mRNA in rabbit reticulocyte lysates [9]. The Y- and H-boxes are found in many transcripts expressed in the testis and brain, including Prm1, Prm2, Tnp1, and Tau [10]."[2]

H box (Mitchell) consensus sequences

"The box H/ACA snoRNAs were most recently recognized as a small RNA family by virtue of an ACA trinucleotide located 3 nt upstream of the mature snoRNA 3' end (41). In addition to this ACA box, they have the consensus H box sequence (5'-ANANNA-3') but have no other primary sequence identity. Despite this lack of primary sequence conservation, the H and ACA boxes are embedded in an evolutionarily conserved hairpin-hinge-hairpin-tail core secondary structure with the H box in the single-stranded hinge region and the ACA box in the single-stranded tail (5, 16)."[3]

The "3' end of mature hTR (45) has an ACA trinucleotide 3 nt upstream of its 3' end. In addition, the 3' region of hTR contains a single H box consensus sequence (5'-AGAGGA-3')."[3]
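As a minimal sketch, the check below tests whether a hexamer fits the quoted consensus 5'-ANANNA-3', reading N as any nucleotide. The function name `is_h_box` and the choice to accept U as well as T are assumptions made for illustration; they are not taken from the cited work.

```python
import re

# Consensus 5'-ANANNA-3' written as a regular expression.
# N is read as "any nucleotide"; U is accepted because the quoted
# sequences come from RNAs (an assumption made for this sketch).
H_BOX = re.compile(r"A[ACGTU]A[ACGTU]{2}A")

def is_h_box(hexamer: str) -> bool:
    """Return True if a six-nucleotide string fits the ANANNA consensus."""
    return H_BOX.fullmatch(hexamer.upper()) is not None

# The hTR H box quoted above fits the consensus.
print("AGAGGA", is_h_box("AGAGGA"))
# A hexamer that lacks the final A does not.
print("AGAGGC", is_h_box("AGAGGC"))
```

Only positions 1, 3, and 6 are constrained, so many unrelated hexamers also pass this test.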

"Comparison with the murine telomerase RNA (mTR) (7) suggests that the snoRNA-like features of hTR are evolutionarily conserved. The mTR 3' end (nt 169 to 397 as numbered in reference 25) has ~76% sequence identity with the corresponding region of hTR (nt 211 to 451) and includes consensus H (5'-ACAGGA-3') and ACA box sequences."[3]

H boxes (Mitchell) samplings

The Basic programs testing the consensus sequence 3'-ANANNA-5' (starting with SuccessablesHbox3.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the strand and direction searched, the number of occurrences found, and then each matching sequence with its position:

  1. Negative strand, negative direction: 64, AAATAA at 4537, AGAGAA at 4527, ACACGA at 4402, ACATCA at 4124, ACATTA at 3973, ACACCA at 3811, AGACGA at 3707, AGAAGA at 3554, ACACCA at 3187, AGATGA at 3159, ACATTA at 3064, AAAGTA at 2886, ATAAAA at 2853, ACATTA at 2675, ACACCA at 2659, ACATCA at 2541, ACATCA at 2340, AGATGA at 2295, AGATGA at 2170, ACATTA at 2088, ACATTA at 1914, AGATGA at 1868, ACATTA at 1779, ATAGAA at 1732, AAAATA at 1729, ATAAAA at 1727, AAAGAA at 1605, ATATAA at 1601, ACACTA at 1480, AGACAA at 1453, AAAAAA at 1432, AAAAAA at 1431, AAAAAA at 1430, AAAAAA at 1429, AAAAAA at 1428, AAAAAA at 1427, AAAAAA at 1426, AAAAAA at 1425, AAAAAA at 1424, AAAAAA at 1423, AAAAAA at 1422, AAAAAA at 1421, AGAAAA at 1419, ACATTA at 1261, ACATTA at 1134, ACATTA at 804, ACACCA at 788, AGATGA at 759, ACATTA at 670, AGATGA at 625, ATACCA at 606, ACATTA at 397, ATACTA at 352, ACATGA at 325, AGAACA at 281, ACATTA at 248, AGATAA at 235, AAAATA at 218, ACAAAA at 215, ATACAA at 213, ACAAGA at 45, ATACAA at 43, AGAAAA at 26, AAAGAA at 24.
  2. Negative strand, positive direction: 32, ACATGA at 4154, ACATCA at 4116, AAATGA at 4094, AGAACA at 4068, AAAAGA at 3929, ACACCA at 3825, ACATGA at 3708, AAAGCA at 3599, ACAGGA at 3572, AGATGA at 3476, ACAGTA at 3414, ACAGCA at 3212, AGAGCA at 3138, AGAACA at 3094, AAAGAA at 3066, ACAGAA at 2838, AAAGGA at 2829, AGAGGA at 2793, AGAGCA at 2704, ATATAA at 2662, ACACTA at 2637, AAACCA at 2632, ATAGAA at 2628, ACACCA at 2603, ATACCA at 2591, AGATCA at 2231, ACATGA at 2141, AAAGCA at 2006, ACAGCA at 1055, AGAGGA at 471, AGAGGA at 207, AGAAGA at 49.
  3. Positive strand, negative direction: 263, ACACGA at 4471, AGAAAA at 4395, AAAGAA at 4393, AAAAGA at 4392, AGAAAA at 4390, AAAGAA at 4388, AAAAGA at 4387, AAAAAA at 4385, AGAAAA at 4383, AAAGAA at 4381, AAAAGA at 4380, AAAAAA at 4378, ATAATA at 4223, AAATAA at 4221, AAAATA at 4220, AAAAAA at 4218, ACAAAA at 4216, AGACAA at 4182, AGAAAA at 4086, AAAGAA at 4084, ATAGAA at 4080, ATAATA at 4077, AAATAA at 4075, AAATAA at 4071, AAAATA at 4070, AAAAAA at 4068, ACAAAA at 4066, AGACCA at 4031, AGATGA at 3920, AGAGCA at 3913, ACAAAA at 3767, AGACCA at 3762, ACAAGA at 3759, AGAGGA at 3675, AGAACA at 3668, AAAGAA at 3666, AGAGGA at 3638, ACAAGA at 3635, ATAATA at 3538, ACACAA at 3514, AAAACA at 3511, AGATCA at 3489, AAACCA at 3484, ATATTA at 3468, ATATTA at 3454, ACATTA at 3436, ACATCA at 3415, AGAGAA at 3406, ACATCA at 3394, AGAGGA at 3387, AAACCA at 3365, AGAAAA at 3343, ACAAGA at 3340, AAACAA at 3338, AAATAA at 3334, AAACAA at 3330, AAAACA at 3329, AGAGCA at 3310, ACAAGA at 3307, AGATCA at 3277, AAATTA at 3175, ATAAAA at 3171, ACATAA at 3169, AAAACA at 3166, AGACCA at 3122, AAACTA at 3029, AAAAAA at 3026, AAATAA at 3013, AAAATA at 3012, AAACCA at 2971, ACATTA at 2951, ACATCA at 2941, AAAAAA at 2929, ATATAA at 2873, AAAATA at 2868, AAACAA at 2842, AAAACA at 2841, AGAAAA at 2839, AAAGAA at 2837, AAAAGA at 2836, AAAAAA at 2834, AGAAAA at 2832, AGAAGA at 2829, AGAGAA at 2827, AAAAGA at 2824, AGAAAA at 2822, AAAGAA at 2820, AAAAGA at 2819, AAAAAA at 2817, AGAAAA at 2815, AGAAGA at 2812, AGAGAA at 2810, AAAAGA at 2807, AGAAAA at 2805, AAAGAA at 2803, AAAGAA at 2799, AAAAGA at 2798, AGAGCA at 2781, AAATCA at 2749, ACAGGA at 2690, AAATCA at 2648, ACAAAA at 2644, ATACAA at 2642, AGACCA at 2599, AAACAA at 2509, AAAACA at 2508, AGAAAA at 2506, ATAGTA at 2500, ACAAAA at 2490, AAACAA at 2488, AAACAA at 2484, AAAGCA at 2479, AAAGCA at 2473, AAAAAA at 2470, AAAAAA at 2469, AAAAAA at 2468, AAAAAA at 2467, AAAAAA at 2466, AAAAAA at 2465, AAAAAA at 2464, AAAAAA at 2463, AAAAAA at 2462, 
AAAAAA at 2461, ACACCA at 2419, AGATCA at 2414, AAACTA at 2312, AAAAAA at 2309, ACAAAA at 2307, ATACAA at 2305, AAAATA at 2302, ACAGCA at 2274, AGACCA at 2262, AAATGA at 2187, AAAAAA at 2184, ACAAAA at 2182, ATACAA at 2180, AGACCA at 2146, AGACCA at 2122, AAAAAA at 2060, AAAAAA at 2059, AAAAAA at 2058, AGAAAA at 2056, AAAGAA at 2054, AAAAGA at 2053, AAAAAA at 2051, AAAAAA at 2050, AAAAAA at 2049, AAAAAA at 2048, AAAAAA at 2047, AAAAAA at 2046, AAAAAA at 2045, AAAAAA at 2044, AAAAAA at 2043, AAAAAA at 2042, AAAAAA at 2041, AAAAAA at 2040, AAAAAA at 2039, AAAAAA at 2038, AGAGCA at 2020, AGATCA at 1988, AAATTA at 1886, AAAAAA at 1883, AAAAAA at 1882, ACAAAA at 1880, ATACAA at 1878, AAAATA at 1875, AGACCA at 1835, AAAATA at 1739, ATAGTA at 1705, AAATGA at 1700, ATACCA at 1668, AAATGA at 1663, AAAGGA at 1640, AAAGAA at 1629, AAAAGA at 1628, AAACAA at 1585, AAATGA at 1580, AAAATA at 1563, AAAGAA at 1550, AAAAGA at 1400, AAAAAA at 1398, AAAAAA at 1397, AAAAAA at 1396, ACAAAA at 1394, AAACAA at 1392, AAACAA at 1388, AAAACA at 1387, AGAGCA at 1368, ATAAGA at 1365, AAATTA at 1233, AAAAAA at 1230, ACAAAA at 1228, AAAAAA at 1105, AAAAAA at 1104, AAAAAA at 1103, AAAAAA at 1102, AAAAAA at 1101, AAAAAA at 1100, AAAAAA at 1099, AAAAAA at 1098, AAAAAA at 1097, AAAAAA at 1096, AAAAAA at 1095, AAAAAA at 1094, ACAACA at 1071, AAAAAA at 942, AAAAAA at 941, AAAAAA at 940, AAAAAA at 939, AAAAAA at 938, AAAAAA at 937, AAAAAA at 936, AAAAAA at 935, AAAAAA at 934, AAAAAA at 933, AAAAAA at 932, AAAAAA at 931, AAAAAA at 930, AAAAAA at 929, AAAAAA at 928, ACACCA at 883, AGATCA at 878, AAATTA at 776, AAAAAA at 773, ACAAAA at 771, ATACAA at 769, AAAATA at 766, AGACCA at 726, AAAAAA at 639, ACAAAA at 637, ATACAA at 635, AAAATA at 632, AGATCA at 590, AAATTA at 498, ATACGA at 492, AAAATA at 489, AAAAAA at 487, ACAAAA at 485, AAAACA at 360, AGAAAA at 358, ATAGAA at 356, ACAGAA at 290, AGAACA at 287, ATATGA at 274, ATAATA at 271, ACATAA at 269, AAACCA at 260, AAACAA at 229, AAAGAA at 225, AAAAGA at 
224, ATAAAA at 222, AAAGCA at 186, ATAAAA at 183, ACATTA at 173, AAAACA at 166, ATACAA at 113, AAAGGA at 106, AGAAAA at 103, ATAGAA at 101, AAACAA at 69, AAAACA at 68, AAAAGA at 55, AGAAAA at 53.
  4. Positive strand, positive direction: 42, AGAGAA at 4387, ATATTA at 4168, AAATAA at 4142, AAATCA at 4137, AAAATA at 4122, AGAGGA at 4059, ACACCA at 3967, AAACCA at 3948, ACACCA at 3643, ACAGGA at 3620, AGAAGA at 3395, ACAGAA at 3393, AGACGA at 3307, AGAGGA at 3302, AGAAGA at 3058, AGAGAA at 3056, AGACCA at 3022, AGACGA at 2976, AAAACA at 2453, AAAAAA at 2451, AAATAA at 2347, AAAAAA at 2281, AGAAAA at 2279, AAAGAA at 2277, AAAAGA at 2276, AAAGTA at 2265, AGACAA at 2261, AGACAA at 2183, ATAAGA at 2180, AGAGTA at 2175, AGATCA at 2168, AGAGGA at 2081, AAAGAA at 1980, AGACGA at 1734, AAAGCA at 1182, ACAAAA at 147, AGAGGA at 142, AAAAGA at 137, ATAAGA at 117, ACATAA at 115, ACAAGA at 108, AGACCA at 103.
  5. Inverse complement, negative strand, negative direction: 270, TTGTCT at 4518, TTCTGT at 4507, TGCTCT at 4473, TCCTGT at 4468, TCTTTT at 4395, TTCTTT at 4394, TTTTCT at 4392, TCTTTT at 4390, TTCTTT at 4389, TTTTCT at 4387, TTTTTT at 4385, TCTTTT at 4383, TTCTTT at 4382, TTTTCT at 4380, TTTTTT at 4378, TATTAT at 4223, TTTTAT at 4220, TTTTTT at 4218, TGTTTT at 4216, TTGTGT at 4196, TTCTGT at 4181, TCTTTT at 4086, TTCTTT at 4085, TTATCT at 4079, TATTAT at 4077, TTATTT at 4072, TTTTAT at 4070, TTTTTT at 4068, TGTTTT at 4066, TTGTAT at 4045, TACTTT at 3922, TCGTGT at 3915, TGTTTT at 3767, TGGTGT at 3764, TGTTCT at 3759, TCCTGT at 3756, TTGTGT at 3670, TCTTGT at 3668, TGTTCT at 3635, TAGTCT at 3618, TATTAT at 3538, TTGTGT at 3513, TTTTGT at 3511, TCGTTT at 3497, TGGTCT at 3486, TCATTT at 3481, TGATCT at 3463, TAATTT at 3438, TAGTAT at 3420, TCCTGT at 3389, TTCTCT at 3380, TTCTTT at 3376, TCTTTT at 3343, TTCTTT at 3342, TGTTCT at 3340, TTATTT at 3335, TTGTTT at 3331, TTTTGT at 3329, TCGTTT at 3312, TGTTCT at 3307, TGCTCT at 3233, TATTTT at 3171, TTGTAT at 3168, TTTTGT at 3166, TGATTT at 3162, TTTTTT at 3026, TTATTT at 3014, TTTTAT at 3012, TAATCT at 3000, TAATCT at 2979, TCCTTT at 2967, TCCTTT at 2957, TAGTCT at 2946, TTTTTT at 2929, TAGTTT at 2890, TTGTCT at 2878, TTATAT at 2870, TTTTAT at 2868, TTTTGT at 2841, TCTTTT at 2839, TTCTTT at 2838, TTTTCT at 2836, TTTTTT at 2834, TCTTTT at 2832, TTCTTT at 2831, TCTTCT at 2829, TTCTCT at 2826, TTTTCT at 2824, TCTTTT at 2822, TTCTTT at 2821, TTTTCT at 2819, TTTTTT at 2817, TCTTTT at 2815, TTCTTT at 2814, TCTTCT at 2812, TTCTCT at 2809, TTTTCT at 2807, TCTTTT at 2805, TTCTTT at 2804, TTCTTT at 2800, TTTTCT at 2798, TTGTCT at 2778, TGTTTT at 2644, TTATAT at 2639, TGATTT at 2635, TTGTTT at 2510, TTTTGT at 2508, TCTTTT at 2506, TTCTTT at 2505, TGTTTT at 2490, TTGTTT at 2489, TTGTTT at 2485, TCGTTT at 2481, TCGTTT at 2475, TTTTTT at 2470, TTTTTT at 2469, TTTTTT at 2468, TTTTTT at 2467, TTTTTT at 2466, TTTTTT at 2465, TTTTTT at 
2464, TTTTTT at 2463, TTTTTT at 2462, TTTTTT at 2461, TTGTCT at 2443, TAGTGT at 2416, TCCTCT at 2370, TTTTTT at 2309, TGTTTT at 2307, TTATGT at 2304, TTTTAT at 2302, TGATTT at 2298, TAGTGT at 2242, TTTTTT at 2184, TGTTTT at 2182, TTCTAT at 2177, TGATTT at 2173, TTTTTT at 2060, TTTTTT at 2059, TTTTTT at 2058, TCTTTT at 2056, TTCTTT at 2055, TTTTCT at 2053, TTTTTT at 2051, TTTTTT at 2050, TTTTTT at 2049, TTTTTT at 2048, TTTTTT at 2047, TTTTTT at 2046, TTTTTT at 2045, TTTTTT at 2044, TTTTTT at 2043, TTTTTT at 2042, TTTTTT at 2041, TTTTTT at 2040, TTTTTT at 2039, TTTTTT at 2038, TCCTCT at 1944, TTTTTT at 1883, TTTTTT at 1882, TGTTTT at 1880, TTATGT at 1877, TTTTAT at 1875, TGATTT at 1871, TCCTCT at 1826, TTTTAT at 1739, TTATCT at 1710, TACTAT at 1702, TAATTT at 1697, TGGTCT at 1670, TACTAT at 1665, TCCTTT at 1642, TTCTTT at 1630, TTTTCT at 1628, TCGTCT at 1614, TTCTAT at 1595, TTGTTT at 1586, TACTTT at 1582, TTATGT at 1565, TTTTAT at 1563, TTGTGT at 1541, TTTTCT at 1400, TTTTTT at 1398, TTTTTT at 1397, TTTTTT at 1396, TGTTTT at 1394, TTGTTT at 1393, TTGTTT at 1389, TTTTGT at 1387, TCGTTT at 1370, TATTCT at 1365, TCCTCT at 1291, TTTTTT at 1230, TGTTTT at 1228, TTTTTT at 1105, TTTTTT at 1104, TTTTTT at 1103, TTTTTT at 1102, TTTTTT at 1101, TTTTTT at 1100, TTTTTT at 1099, TTTTTT at 1098, TTTTTT at 1097, TTTTTT at 1096, TTTTTT at 1095, TTTTTT at 1094, TTGTCT at 1073, TGTTGT at 1071, TCCTCT at 1000, TTTTTT at 942, TTTTTT at 941, TTTTTT at 940, TTTTTT at 939, TTTTTT at 938, TTTTTT at 937, TTTTTT at 936, TTTTTT at 935, TTTTTT at 934, TTTTTT at 933, TTTTTT at 932, TTTTTT at 931, TTTTTT at 930, TTTTTT at 929, TTTTTT at 928, TTGTCT at 907, TAGTGT at 880, TCCTCT at 834, TTTTTT at 773, TGTTTT at 771, TTATGT at 768, TTTTAT at 766, TGATTT at 762, TTTTTT at 639, TGTTTT at 637, TTATGT at 634, TTTTAT at 632, TGATTT at 628, TCCTCT at 581, TTCTGT at 559, TAGTGT at 528, TGCTTT at 494, TTTTAT at 489, TTTTTT at 487, TGTTTT at 485, TTGTAT at 467, TTCTGT at 422, TTTTGT at 360, TCTTTT at 358, 
TACTAT at 353, TGCTTT at 312, TAGTGT at 295, TTGTCT at 289, TCTTGT at 287, TATTAT at 271, TTCTTT at 226, TTTTCT at 224, TATTTT at 222, TATTTT at 183, TTGTCT at 168, TTTTGT at 166, TTCTTT at 135, TACTTT at 126, TCCTAT at 108, TCTTTT at 103, TCCTAT at 74, TTTTGT at 68, TTCTAT at 57, TTTTCT at 55, TCTTTT at 53, TTGTCT at 13.
  6. Inverse complement, negative strand, positive direction: 55, TGCTGT at 4392, TTCTCT at 4386, TGGTCT at 4380, TCATGT at 4365, TAGTTT at 4139, TGATTT at 4134, TTTTAT at 4122, TCATTT at 4119, TGGTTT at 4108, TGGTGT at 3969, TGGTGT at 3950, TCCTGT at 3622, TTCTTT at 3397, TCTTCT at 3395, TCCTCT at 3304, TGGTCT at 3299, TGGTCT at 3245, TCTTCT at 3058, TTGTCT at 3053, TTGTCT at 3004, TCCTCT at 2981, TTCTGT at 2957, TGGTCT at 2941, TTCTGT at 2925, TGGTGT at 2813, TTCTTT at 2585, TTTTGT at 2453, TTTTTT at 2451, TAATTT at 2440, TTTTTT at 2281, TCTTTT at 2279, TTCTTT at 2278, TTTTCT at 2276, TTCTGT at 2182, TATTCT at 2180, TCATAT at 2177, TAGTGT at 2170, TGCTAT at 2157, TCATCT at 2111, TTCTCT at 1990, TTCTTT at 1981, TCGTCT at 1493, TCGTCT at 1393, TCCTCT at 221, TGTTTT at 147, TCCTGT at 144, TTCTCT at 139, TTTTCT at 137, TTCTCT at 119, TATTCT at 117, TTGTAT at 114, TTCTTT at 110, TGTTCT at 108, TGGTGT at 105, TCGTGT at 80.
  7. Inverse complement, positive strand, negative direction: 50, TCCTCT at 4428, TGCTCT at 4404, TCATCT at 4058, TGCTGT at 3957, TGATGT at 3808, TCCTCT at 3790, TGCTGT at 3709, TTCTGT at 3556, TCTTCT at 3554, TTCTGT at 3319, TGCTGT at 3265, TGCTAT at 2898, TATTTT at 2853, TACTTT at 2216, TACTGT at 2163, TCCTGT at 1911, TTATCT at 1731, TTTTAT at 1729, TATTTT at 1727, TTATTT at 1726, TTCTAT at 1525, TGATCT at 1482, TGGTGT at 1477, TTTTTT at 1432, TTTTTT at 1431, TTTTTT at 1430, TTTTTT at 1429, TTTTTT at 1428, TTTTTT at 1427, TTTTTT at 1426, TTTTTT at 1425, TTTTTT at 1424, TTTTTT at 1423, TTTTTT at 1422, TTTTTT at 1421, TCTTTT at 1419, TTCTTT at 1418, TGGTGT at 793, TTCTCT at 622, TGGTGT at 608, TAATAT at 603, TTCTTT at 347, TCTTGT at 281, TAATAT at 272, TTTTAT at 218, TGTTTT at 215, TTCTTT at 47, TGTTCT at 45, TCTTTT at 26, TTCTTT at 25.
  8. Inverse complement, positive strand, positive direction: 40, TCCTGT at 4252, TAATAT at 4166, TAGTAT at 4149, TCTTGT at 4068, TTTTCT at 3929, TGGTGT at 3859, TCGTGT at 3740, TCCTCT at 3650, TACTGT at 3569, TGGTCT at 3548, TTATTT at 3427, TCATCT at 3416, TCGTCT at 3214, TTGTCT at 3179, TGGTTT at 3175, TCCTGT at 3131, TTGTGT at 3096, TCTTGT at 3094, TACTGT at 2843, TTGTGT at 2835, TCCTTT at 2831, TCGTTT at 2706, TTGTCT at 2652, TGGTGT at 2634, TTATCT at 2627, TCCTTT at 2623, TGGTGT at 2600, TAATAT at 2548, TCCTGT at 2460, TGATGT at 2428, TACTGT at 2412, TGGTCT at 2228, TACTTT at 2146, TGGTGT at 2123, TGCTAT at 1837, TGGTCT at 1631, TCCTCT at 710, TACTGT at 62, TCTTCT at 49, TCCTCT at 46.
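The listing above can be approximated with a short scan. The sketch below is not the original Basic code, which is not shown here. It assumes that hits may overlap (the consecutive AAAAAA positions in the lists suggest this) and that positions are 1-based starts counted along the given string; the names `reverse_complement`, `motif_to_regex`, and `find_hits` are hypothetical.

```python
import re

def reverse_complement(motif: str) -> str:
    """Inverse complement of a motif; N stays N."""
    table = str.maketrans("ACGTN", "TGCAN")
    return motif.translate(table)[::-1]

def motif_to_regex(motif: str) -> str:
    """Turn a motif with N wildcards into a regular expression."""
    return "".join("[ACGT]" if base == "N" else base for base in motif)

def find_hits(sequence: str, motif: str) -> list[tuple[str, int]]:
    """List (hexamer, 1-based start) for every hit, overlaps included."""
    # A lookahead lets the regex engine report overlapping matches.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.group(1), m.start() + 1)
            for m in pattern.finditer(sequence.upper())]

# Toy sequence for illustration only; it is not taken from the A1BG promoters.
demo = "GGAAAAAAATTGTCTGG"
print(reverse_complement("ANANNA"))
print(find_hits(demo, "ANANNA"))
print(find_hits(demo, reverse_complement("ANANNA")))
```

The inverse complement of ANANNA is TNNTNT, which is consistent with the T-rich hexamers reported in items 5 to 8.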

H boxes (Mitchell) UTRs

  1. Negative strand, negative direction: AAATAA at 4537, AGAGAA at 4527, TTGTCT at 4518, TTCTGT at 4507, TGCTCT at 4473, TCCTGT at 4468, TCCTCT at 4428, TGCTCT at 4404, ACACGA at 4402, TCTTTT at 4395, TTCTTT at 4394, TTTTCT at 4392, TCTTTT at 4390, TTCTTT at 4389, TTTTCT at 4387, TTTTTT at 4385, TCTTTT at 4383, TTCTTT at 4382, TTTTCT at 4380, TTTTTT at 4378, TATTAT at 4223, TTTTAT at 4220, TTTTTT at 4218, TGTTTT at 4216, TTGTGT at 4196, TTCTGT at 4181, ACATCA at 4124, TCTTTT at 4086, TTCTTT at 4085, TTATCT at 4079, TATTAT at 4077, TTATTT at 4072, TTTTAT at 4070, TTTTTT at 4068, TGTTTT at 4066, TCATCT at 4058, TTGTAT at 4045, ACATTA at 3973, TGCTGT at 3957, TACTTT at 3922, TCGTGT at 3915, ACACCA at 3811, TGATGT at 3808, TCCTCT at 3790, TGTTTT at 3767, TGGTGT at 3764, TGTTCT at 3759, TCCTGT at 3756, TGCTGT at 3709, AGACGA at 3707, TTGTGT at 3670, TCTTGT at 3668, TGTTCT at 3635, TAGTCT at 3618, TTCTGT at 3556, AGAAGA at 3554, TATTAT at 3538, TTGTGT at 3513, TTTTGT at 3511, TCGTTT at 3497, TGGTCT at 3486, TCATTT at 3481, TGATCT at 3463, TAATTT at 3438, TAGTAT at 3420, TCCTGT at 3389, TTCTCT at 3380, TTCTTT at 3376, TCTTTT at 3343, TTCTTT at 3342, TGTTCT at 3340, TTATTT at 3335, TTGTTT at 3331, TTTTGT at 3329, TTCTGT at 3319, TCGTTT at 3312, TGTTCT at 3307, TGCTGT at 3265, TGCTCT at 3233, ACACCA at 3187, TATTTT at 3171, TTGTAT at 3168, TTTTGT at 3166, TGATTT at 3162, AGATGA at 3159, ACATTA at 3064, TTTTTT at 3026, TTATTT at 3014, TTTTAT at 3012, TAATCT at 3000, TAATCT at 2979, TCCTTT at 2967, TCCTTT at 2957, TAGTCT at 2946, TTTTTT at 2929, TGCTAT at 2898, TAGTTT at 2890, AAAGTA at 2886, TTGTCT at 2878, TTATAT at 2870, TTTTAT at 2868, ATAAAA at 2853.
  2. Positive strand, negative direction: ACACGA at 4471, AGAAAA at 4395, AAAGAA at 4393, AAAAGA at 4392, AGAAAA at 4390, AAAGAA at 4388, AAAAGA at 4387, AAAAAA at 4385, AGAAAA at 4383, AAAGAA at 4381, AAAAGA at 4380, AAAAAA at 4378, ATAATA at 4223, AAATAA at 4221, AAAATA at 4220, AAAAAA at 4218, ACAAAA at 4216, AGACAA at 4182, AGAAAA at 4086, AAAGAA at 4084, ATAGAA at 4080, ATAATA at 4077, AAATAA at 4075, AAATAA at 4071, AAAATA at 4070, AAAAAA at 4068, ACAAAA at 4066, AGACCA at 4031, AGATGA at 3920, AGAGCA at 3913, ACAAAA at 3767, AGACCA at 3762, ACAAGA at 3759, AGAGGA at 3675, AGAACA at 3668, AAAGAA at 3666, AGAGGA at 3638, ACAAGA at 3635, ATAATA at 3538, ACACAA at 3514, AAAACA at 3511, AGATCA at 3489, AAACCA at 3484, ATATTA at 3468, ATATTA at 3454, ACATTA at 3436, ACATCA at 3415, AGAGAA at 3406, ACATCA at 3394, AGAGGA at 3387, AAACCA at 3365, AGAAAA at 3343, ACAAGA at 3340, AAACAA at 3338, AAATAA at 3334, AAACAA at 3330, AAAACA at 3329, AGAGCA at 3310, ACAAGA at 3307, AGATCA at 3277, AAATTA at 3175, ATAAAA at 3171, ACATAA at 3169, AAAACA at 3166, AGACCA at 3122, AAACTA at 3029, AAAAAA at 3026, AAATAA at 3013, AAAATA at 3012, AAACCA at 2971, ACATTA at 2951, ACATCA at 2941, AAAAAA at 2929, ATATAA at 2873, AAAATA at 2868.

H boxes (Mitchell) core promoters

  1. Negative strand, negative direction: TTTTGT at 2841, TCTTTT at 2839, TTCTTT at 2838, TTTTCT at 2836, TTTTTT at 2834, TCTTTT at 2832, TTCTTT at 2831, TCTTCT at 2829, TTCTCT at 2826, TTTTCT at 2824, TCTTTT at 2822, TTCTTT at 2821, TTTTCT at 2819, TTTTTT at 2817, TCTTTT at 2815, TTCTTT at 2814, TCTTCT at 2812.
  2. Positive strand, negative direction: AAACAA at 2842, AAAACA at 2841, AGAAAA at 2839, AAAGAA at 2837, AAAAGA at 2836, AAAAAA at 2834, AGAAAA at 2832, AGAAGA at 2829, AGAGAA at 2827, AAAAGA at 2824, AGAAAA at 2822, AAAGAA at 2820, AAAAGA at 2819, AAAAAA at 2817, AGAAAA at 2815, AGAAGA at 2812.


  1. Negative strand, positive direction: TGCTGT at 4392, TTCTCT at 4386, TGGTCT at 4380, TCATGT at 4365.
  2. Positive strand, positive direction: AGAGAA at 4387.

H boxes (Mitchell) proximal promoters

  1. Negative strand, negative direction: TTCTCT at 2809, TTTTCT at 2807, TCTTTT at 2805, TTCTTT at 2804, TTCTTT at 2800, TTTTCT at 2798, TTGTCT at 2778, ACATTA at 2675, ACACCA at 2659, TGTTTT at 2644, TTATAT at 2639, TGATTT at 2635.
  2. Positive strand, negative direction: AGAGAA at 2810, AAAAGA at 2807, AGAAAA at 2805, AAAGAA at 2803, AAAGAA at 2799, AAAAGA at 2798, AGAGCA at 2781, AAATCA at 2749, ACAGGA at 2690, AAATCA at 2648, ACAAAA at 2644, ATACAA at 2642, AGACCA at 2599.


  1. Negative strand, positive direction: ACATGA at 4154, TAGTTT at 4139, TGATTT at 4134, TTTTAT at 4122, TCATTT at 4119, ACATCA at 4116, TGGTTT at 4108, AAATGA at 4094, AGAACA at 4068.
  2. Positive strand, positive direction: TCCTGT at 4252, ATATTA at 4168, TAATAT at 4166, TAGTAT at 4149, AAATAA at 4142, AAATCA at 4137, AAAATA at 4122, TCTTGT at 4068, AGAGGA at 4059.

H boxes (Mitchell) distal promoters

  1. Negative strand, negative direction: ACATCA at 2541, TTGTTT at 2510, TTTTGT at 2508, TCTTTT at 2506, TTCTTT at 2505, TGTTTT at 2490, TTGTTT at 2489, TTGTTT at 2485, TCGTTT at 2481, TCGTTT at 2475, TTTTTT at 2470, TTTTTT at 2469, TTTTTT at 2468, TTTTTT at 2467, TTTTTT at 2466, TTTTTT at 2465, TTTTTT at 2464, TTTTTT at 2463, TTTTTT at 2462, TTTTTT at 2461, TTGTCT at 2443, TAGTGT at 2416, TCCTCT at 2370, ACATCA at 2340, TTTTTT at 2309, TGTTTT at 2307, TTATGT at 2304, TTTTAT at 2302, TGATTT at 2298, AGATGA at 2295, TAGTGT at 2242, TTTTTT at 2184, TGTTTT at 2182, TTCTAT at 2177, TGATTT at 2173, AGATGA at 2170, ACATTA at 2088, TTTTTT at 2060, TTTTTT at 2059, TTTTTT at 2058, TCTTTT at 2056, TTCTTT at 2055, TTTTCT at 2053, TTTTTT at 2051, TTTTTT at 2050, TTTTTT at 2049, TTTTTT at 2048, TTTTTT at 2047, TTTTTT at 2046, TTTTTT at 2045, TTTTTT at 2044, TTTTTT at 2043, TTTTTT at 2042, TTTTTT at 2041, TTTTTT at 2040, TTTTTT at 2039, TTTTTT at 2038, TCCTCT at 1944, ACATTA at 1914, TTTTTT at 1883, TTTTTT at 1882, TGTTTT at 1880, TTATGT at 1877, TTTTAT at 1875, TGATTT at 1871, AGATGA at 1868, TCCTCT at 1826, ACATTA at 1779, TTTTAT at 1739, ATAGAA at 1732, AAAATA at 1729, ATAAAA at 1727, TTATCT at 1710, TACTAT at 1702, TAATTT at 1697, TGGTCT at 1670, TACTAT at 1665, TCCTTT at 1642, TTCTTT at 1630, TTTTCT at 1628, TCGTCT at 1614, AAAGAA at 1605, ATATAA at 1601, TTCTAT at 1595, TTGTTT at 1586, TACTTT at 1582, TTATGT at 1565, TTTTAT at 1563, TTGTGT at 1541, ACACTA at 1480, AGACAA at 1453, AAAAAA at 1432, AAAAAA at 1431, AAAAAA at 1430, AAAAAA at 1429, AAAAAA at 1428, AAAAAA at 1427, AAAAAA at 1426, AAAAAA at 1425, AAAAAA at 1424, AAAAAA at 1423, AAAAAA at 1422, AAAAAA at 1421, AGAAAA at 1419, TTTTCT at 1400, TTTTTT at 1398, TTTTTT at 1397, TTTTTT at 1396, TGTTTT at 1394, TTGTTT at 1393, TTGTTT at 1389, TTTTGT at 1387, TCGTTT at 1370, TATTCT at 1365, TCCTCT at 1291, ACATTA at 1261, TTTTTT at 1230, TGTTTT at 1228, ACATTA at 1134, TTTTTT at 1105, TTTTTT at 1104, TTTTTT at 1103, 
TTTTTT at 1102, TTTTTT at 1101, TTTTTT at 1100, TTTTTT at 1099, TTTTTT at 1098, TTTTTT at 1097, TTTTTT at 1096, TTTTTT at 1095, TTTTTT at 1094, TTGTCT at 1073, TGTTGT at 1071, TCCTCT at 1000, TTTTTT at 942, TTTTTT at 941, TTTTTT at 940, TTTTTT at 939, TTTTTT at 938, TTTTTT at 937, TTTTTT at 936, TTTTTT at 935, TTTTTT at 934, TTTTTT at 933, TTTTTT at 932, TTTTTT at 931, TTTTTT at 930, TTTTTT at 929, TTTTTT at 928, TTGTCT at 907, TAGTGT at 880, TCCTCT at 834, ACATTA at 804, ACACCA at 788, TTTTTT at 773, TGTTTT at 771, TTATGT at 768, TTTTAT at 766, TGATTT at 762, AGATGA at 759, ACATTA at 670, TTTTTT at 639, TGTTTT at 637, TTATGT at 634, TTTTAT at 632, TGATTT at 628, AGATGA at 625, ATACCA at 606, TCCTCT at 581, TTCTGT at 559, TAGTGT at 528, TGCTTT at 494, TTTTAT at 489, TTTTTT at 487, TGTTTT at 485, TTGTAT at 467, TTCTGT at 422, ACATTA at 397, TTTTGT at 360, TCTTTT at 358, TACTAT at 353, ATACTA at 352, ACATGA at 325, TGCTTT at 312, TAGTGT at 295, TTGTCT at 289, TCTTGT at 287, AGAACA at 281, TATTAT at 271, ACATTA at 248, AGATAA at 235, TTCTTT at 226, TTTTCT at 224, TATTTT at 222, AAAATA at 218, ACAAAA at 215, ATACAA at 213, TATTTT at 183, TTGTCT at 168, TTTTGT at 166, TTCTTT at 135, TACTTT at 126, TCCTAT at 108, TCTTTT at 103, TCCTAT at 74, TTTTGT at 68, TTCTAT at 57, TTTTCT at 55, TCTTTT at 53, ACAAGA at 45, ATACAA at 43, AGAAAA at 26, AAAGAA at 24, TTGTCT at 13.
  2. Positive strand, negative direction: AAACAA at 2509, AAAACA at 2508, AGAAAA at 2506, ATAGTA at 2500, ACAAAA at 2490, AAACAA at 2488, AAACAA at 2484, AAAGCA at 2479, AAAGCA at 2473, AAAAAA at 2470, AAAAAA at 2469, AAAAAA at 2468, AAAAAA at 2467, AAAAAA at 2466, AAAAAA at 2465, AAAAAA at 2464, AAAAAA at 2463, AAAAAA at 2462, AAAAAA at 2461, ACACCA at 2419, AGATCA at 2414, AAACTA at 2312, AAAAAA at 2309, ACAAAA at 2307, ATACAA at 2305, AAAATA at 2302, ACAGCA at 2274, AGACCA at 2262, TACTTT at 2216, AAATGA at 2187, AAAAAA at 2184, ACAAAA at 2182, ATACAA at 2180, TACTGT at 2163, AGACCA at 2146, AGACCA at 2122, AAAAAA at 2060, AAAAAA at 2059, AAAAAA at 2058, AGAAAA at 2056, AAAGAA at 2054, AAAAGA at 2053, AAAAAA at 2051, AAAAAA at 2050, AAAAAA at 2049, AAAAAA at 2048, AAAAAA at 2047, AAAAAA at 2046, AAAAAA at 2045, AAAAAA at 2044, AAAAAA at 2043, AAAAAA at 2042, AAAAAA at 2041, AAAAAA at 2040, AAAAAA at 2039, AAAAAA at 2038, AGAGCA at 2020, AGATCA at 1988, TCCTGT at 1911, AAATTA at 1886, AAAAAA at 1883, AAAAAA at 1882, ACAAAA at 1880, ATACAA at 1878, AAAATA at 1875, AGACCA at 1835, AAAATA at 1739, TTATCT at 1731, TTTTAT at 1729, TATTTT at 1727, TTATTT at 1726, ATAGTA at 1705, AAATGA at 1700, ATACCA at 1668, AAATGA at 1663, AAAGGA at 1640, AAAGAA at 1629, AAAAGA at 1628, AAACAA at 1585, AAATGA at 1580, AAAATA at 1563, AAAGAA at 1550, TTCTAT at 1525, TGATCT at 1482, TGGTGT at 1477, TTTTTT at 1432, TTTTTT at 1431, TTTTTT at 1430, TTTTTT at 1429, TTTTTT at 1428, TTTTTT at 1427, TTTTTT at 1426, TTTTTT at 1425, TTTTTT at 1424, TTTTTT at 1423, TTTTTT at 1422, TTTTTT at 1421, TCTTTT at 1419, TTCTTT at 1418, AAAAGA at 1400, AAAAAA at 1398, AAAAAA at 1397, AAAAAA at 1396, ACAAAA at 1394, AAACAA at 1392, AAACAA at 1388, AAAACA at 1387, AGAGCA at 1368, ATAAGA at 1365, AAATTA at 1233, AAAAAA at 1230, ACAAAA at 1228, AAAAAA at 1105, AAAAAA at 1104, AAAAAA at 1103, AAAAAA at 1102, AAAAAA at 1101, AAAAAA at 1100, AAAAAA at 1099, AAAAAA at 1098, AAAAAA at 1097, AAAAAA at 1096, 
AAAAAA at 1095, AAAAAA at 1094, ACAACA at 1071, AAAAAA at 942, AAAAAA at 941, AAAAAA at 940, AAAAAA at 939, AAAAAA at 938, AAAAAA at 937, AAAAAA at 936, AAAAAA at 935, AAAAAA at 934, AAAAAA at 933, AAAAAA at 932, AAAAAA at 931, AAAAAA at 930, AAAAAA at 929, AAAAAA at 928, ACACCA at 883, AGATCA at 878, TGGTGT at 793, AAATTA at 776, AAAAAA at 773, ACAAAA at 771, ATACAA at 769, AAAATA at 766, AGACCA at 726, AAAAAA at 639, ACAAAA at 637, ATACAA at 635, AAAATA at 632, TTCTCT at 622, TGGTGT at 608, TAATAT at 603, AGATCA at 590, AAATTA at 498, ATACGA at 492, AAAATA at 489, AAAAAA at 487, ACAAAA at 485, AAAACA at 360, AGAAAA at 358, ATAGAA at 356, TTCTTT at 347, ACAGAA at 290, AGAACA at 287, TCTTGT at 281, ATATGA at 274, TAATAT at 272, ATAATA at 271, ACATAA at 269, AAACCA at 260, AAACAA at 229, AAAGAA at 225, AAAAGA at 224, ATAAAA at 222, TTTTAT at 218, TGTTTT at 215, AAAGCA at 186, ATAAAA at 183, ACATTA at 173, AAAACA at 166, ATACAA at 113, AAAGGA at 106, AGAAAA at 103, ATAGAA at 101, AAACAA at 69, AAAACA at 68, AAAAGA at 55, AGAAAA at 53, TTCTTT at 47, TGTTCT at 45, TCTTTT at 26, TTCTTT at 25.


  1. Negative strand, positive direction: TGGTGT at 3969, TGGTGT at 3950, AAAAGA at 3929, ACACCA at 3825, ACATGA at 3708, TCCTGT at 3622, AAAGCA at 3599, ACAGGA at 3572, AGATGA at 3476, ACAGTA at 3414, TTCTTT at 3397, TCTTCT at 3395, TCCTCT at 3304, TGGTCT at 3299, TGGTCT at 3245, ACAGCA at 3212, AGAGCA at 3138, AGAACA at 3094, AAAGAA at 3066, TCTTCT at 3058, TTGTCT at 3053, TTGTCT at 3004, TCCTCT at 2981, TTCTGT at 2957, TGGTCT at 2941, TTCTGT at 2925, ACAGAA at 2838, AAAGGA at 2829, TGGTGT at 2813, AGAGGA at 2793, AGAGCA at 2704, ATATAA at 2662, ACACTA at 2637, AAACCA at 2632, ATAGAA at 2628, ACACCA at 2603, ATACCA at 2591, TTCTTT at 2585, TTTTGT at 2453, TTTTTT at 2451, TAATTT at 2440, TTTTTT at 2281, TCTTTT at 2279, TTCTTT at 2278, TTTTCT at 2276, AGATCA at 2231, TTCTGT at 2182, TATTCT at 2180, TCATAT at 2177, TAGTGT at 2170, TGCTAT at 2157, ACATGA at 2141, TCATCT at 2111, AAAGCA at 2006, TTCTCT at 1990, TTCTTT at 1981, TCGTCT at 1493, TCGTCT at 1393, ACAGCA at 1055, AGAGGA at 471, TCCTCT at 221, AGAGGA at 207, TGTTTT at 147, TCCTGT at 144, TTCTCT at 139, TTTTCT at 137, TTCTCT at 119, TATTCT at 117, TTGTAT at 114, TTCTTT at 110, TGTTCT at 108, TGGTGT at 105, TCGTGT at 80, AGAAGA at 49.
  2. Positive strand, positive direction: ACACCA at 3967, AAACCA at 3948, TTTTCT at 3929, TGGTGT at 3859, TCGTGT at 3740, TCCTCT at 3650, ACACCA at 3643, ACAGGA at 3620, TACTGT at 3569, TGGTCT at 3548, TTATTT at 3427, TCATCT at 3416, AGAAGA at 3395, ACAGAA at 3393, AGACGA at 3307, AGAGGA at 3302, TCGTCT at 3214, TTGTCT at 3179, TGGTTT at 3175, TCCTGT at 3131, TTGTGT at 3096, TCTTGT at 3094, AGAAGA at 3058, AGAGAA at 3056, AGACCA at 3022, AGACGA at 2976, TACTGT at 2843, TTGTGT at 2835, TCCTTT at 2831, TCGTTT at 2706, TTGTCT at 2652, TGGTGT at 2634, TTATCT at 2627, TCCTTT at 2623, TGGTGT at 2600, TAATAT at 2548, TCCTGT at 2460, AAAACA at 2453, AAAAAA at 2451, TGATGT at 2428, TACTGT at 2412, AAATAA at 2347, AAAAAA at 2281, AGAAAA at 2279, AAAGAA at 2277, AAAAGA at 2276, AAAGTA at 2265, AGACAA at 2261, TGGTCT at 2228, AGACAA at 2183, ATAAGA at 2180, AGAGTA at 2175, AGATCA at 2168, TACTTT at 2146, TGGTGT at 2123, AGAGGA at 2081, AAAGAA at 1980, TGCTAT at 1837, AGACGA at 1734, TGGTCT at 1631, AAAGCA at 1182, TCCTCT at 710, ACAAAA at 147, AGAGGA at 142, AAAAGA at 137, ATAAGA at 117, ACATAA at 115, ACAAGA at 108, AGACCA at 103, TACTGT at 62, TCTTCT at 49, TCCTCT at 46.

H boxes (Mitchell) random dataset samplings

  1. HboxMr0: 71, ACAAAA at 4524, AGATTA at 4463, ACATAA at 4450, AGATGA at 4445, ATAAGA at 4442, ACATCA at 4437, AAAACA at 4432, AGACAA at 4235, AAAGTA at 3758, ACAACA at 3729, AGAAAA at 3657, ACAGAA at 3655, ATATAA at 3602, AAAGCA at 3597, ACAATA at 3577, AAAGTA at 3484, AAAGTA at 3340, ACAAAA at 3171, AGAATA at 3150, ATAGGA at 2939, AGACGA at 2908, ATATTA at 2852, AAATCA at 2809, ATAGGA at 2799, AAACGA at 2779, ATAAAA at 2737, AAAATA at 2697, AGAATA at 2677, ACACTA at 2555, AGAGGA at 2529, ATATTA at 2374, ACAAAA at 2333, AGACAA at 2331, ATAATA at 2263, AAATAA at 2261, AAAATA at 2260, AGACAA at 2183, ACACCA at 2165, AAAACA at 2086, ATACTA at 1964, AGAATA at 1961, AGAGAA at 1959, AGAGTA at 1872, AGATCA at 1853, AAATAA at 1790, AAACAA at 1576, AAAGGA at 1530, AAACGA at 1502, ATAAAA at 1499, AAATAA at 1497, ATAATA at 1492, ACAAGA at 1436, AAACCA at 1324, AAACCA at 1244, AGACGA at 1237, AGATGA at 1126, AAACCA at 1088, AAACTA at 833, AGAAAA at 634, AAAGAA at 632, AAAAGA at 593, ACAGCA at 579, AAATTA at 546, ATAGAA at 542, AAACAA at 502, AAACCA at 284, ATAAAA at 199, AAATAA at 197, AAAATA at 196, ATAATA at 146, AAATAA at 144.
  2. RDr1: 0.
  3. RDr2: 0.
  4. RDr3: 0.
  5. RDr4: 0.
  6. RDr5: 0.
  7. RDr6: 0.
  8. RDr7: 0.
  9. RDr8: 0.
  10. RDr9: 0.
  11. RDr0ci: 0.
  12. RDr1ci: 0.
  13. RDr2ci: 0.
  14. RDr3ci: 0.
  15. RDr4ci: 0.
  16. RDr5ci: 0.
  17. RDr6ci: 0.
  18. RDr7ci: 0.
  19. RDr8ci: 0.
  20. RDr9ci: 0.
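For orientation, the number of chance matches to ANANNA in a random sequence can be estimated directly. The sketch below assumes uniform, independent bases and a hypothetical sequence length of 4560 nucleotides, chosen only because the listed positions run to about 4540; neither assumption comes from the source, and the function names are illustrative.

```python
import random

def expected_hits(length: int, motif: str = "ANANNA") -> float:
    """Expected overlapping hits in a uniform random sequence."""
    fixed = sum(1 for base in motif if base != "N")   # constrained positions
    windows = max(length - len(motif) + 1, 0)         # possible start sites
    return windows * 0.25 ** fixed

def count_hits(sequence: str, motif: str = "ANANNA") -> int:
    """Count overlapping windows of the sequence that fit the motif."""
    k = len(motif)
    return sum(
        all(m == "N" or m == s for m, s in zip(motif, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    )

LENGTH = 4560  # hypothetical length, see the assumptions above
rng = random.Random(0)  # fixed seed so the run is repeatable
random_seq = "".join(rng.choice("ACGT") for _ in range(LENGTH))

print(expected_hits(LENGTH))   # 4555 windows / 64, about 71
print(count_hits(random_seq))  # one simulated count for comparison
```

Under these assumptions roughly 71 chance matches are expected per strand and direction, the same order as the 71 reported for HboxMr0.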

RDr UTRs

RDr core promoters

RDr proximal promoters

RDr distal promoters

H boxes (Mitchell) analysis and results

They "have the consensus H box sequence (5'-ANANNA-3') but have no other primary sequence identity."[3]

Reals or randoms  Promoters  Direction           Numbers  Strands  Occurrences  Averages (± 0.1)
Reals             UTR        negative            0        2        0            0
Randoms           UTR        arbitrary negative  0        10       0            0
Randoms           UTR        alternate negative  0        10       0            0
Reals             Core       negative            0        2        0            0
Randoms           Core       negative            0        10       0            0
Reals             Core       positive            0        2        0            0
Randoms           Core       positive            0        10       0            0
Reals             Proximal   negative            0        2        0            0
Randoms           Proximal   negative            0        10       0            0
Reals             Proximal   positive            0        2        0            0
Randoms           Proximal   positive            0        10       0            0
Reals             Distal     negative            0        2        0            0
Randoms           Distal     negative            0        10       0            0
Reals             Distal     positive            0        2        0            0
Randoms           Distal     positive            0        10       0            0

Comparison:

Occurrences of the consensus sequence in the real promoters are more numerous than in the random datasets. This suggests that the real responsive element consensus sequences are likely active or activable.

H boxes (Rozhdestvensky)

An H box has a consensus sequence of 3'-ACACCA-5'.[4]

H boxes (Rozhdestvensky) in promoters of A1BG

The Basic programs (starting with SuccessablesHACA.bas) compare the consensus sequence with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). For each program, the list below gives the file name, the sequence searched for, the number of occurrences found, and the position of each occurrence:

  1. negative strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesHACA--.bas, looking for 3'-ACACCA-5', 4, 3'-ACACCA-5' at 788, 3'-ACACCA-5' at 2659, 3'-ACACCA-5' at 3187, 3'-ACACCA-5' at 3811.
  2. negative strand in the positive direction (from ZNF497 to A1BG) is SuccessablesHACA-+.bas, looking for 3'-ACACCA-5', 1, 3'-ACACCA-5' at 386.
  3. positive strand in the negative direction is SuccessablesHACA+-.bas, looking for 3'-ACACCA-5', 2, 3'-ACACCA-5' at 883, 3'-ACACCA-5' at 2419.
  4. positive strand in the positive direction is SuccessablesHACA++.bas, looking for 3'-ACACCA-5', 2, 3'-ACACCA-5' at 204, 3'-ACACCA-5' at 528.
  5. complement, negative strand, negative direction is SuccessablesHACAc--.bas, looking for 3'-TGTGGT-5', 2, 3'-TGTGGT-5' at 883, 3'-TGTGGT-5' at 2419.
  6. complement, negative strand, positive direction is SuccessablesHACAc-+.bas, looking for 3'-TGTGGT-5', 2, 3'-TGTGGT-5' at 204, 3'-TGTGGT-5' at 528.
  7. complement, positive strand, negative direction is SuccessablesHACAc+-.bas, looking for 3'-TGTGGT-5', 4, 3'-TGTGGT-5' at 788, 3'-TGTGGT-5' at 2659, 3'-TGTGGT-5' at 3187, 3'-TGTGGT-5' at 3811.
  8. complement, positive strand, positive direction is SuccessablesHACAc++.bas, looking for 3'-TGTGGT-5', 1, 3'-TGTGGT-5' at 386.
  9. inverse complement, negative strand, negative direction is SuccessablesHACAci--.bas, looking for 3'-TGGTGT-5', 1, 3'-TGGTGT-5' at 3764.
  10. inverse complement, negative strand, positive direction is SuccessablesHACAci-+.bas, looking for 3'-TGGTGT-5', 2, 3'-TGGTGT-5' at 511, 3'-TGGTGT-5' at 530.
  11. inverse complement, positive strand, negative direction is SuccessablesHACAci+-.bas, looking for 3'-TGGTGT-5', 3, 3'-TGGTGT-5' at 608, 3'-TGGTGT-5' at 793, 3'-TGGTGT-5' at 1477.
  12. inverse complement, positive strand, positive direction is SuccessablesHACAci++.bas, looking for 3'-TGGTGT-5', 1, 3'-TGGTGT-5' at 420.
  13. inverse, negative strand, negative direction is SuccessablesHACAi--.bas, looking for 3'-ACCACA-5', 3, 3'-ACCACA-5' at 608, 3'-ACCACA-5' at 793, 3'-ACCACA-5' at 1477.
  14. inverse, negative strand, positive direction is SuccessablesHACAi-+.bas, looking for 3'-ACCACA-5', 1, 3'-ACCACA-5' at 420.
  15. inverse, positive strand, negative direction is SuccessablesHACAi+-.bas, looking for 3'-ACCACA-5', 1, 3'-ACCACA-5' at 3764.
  16. inverse, positive strand, positive direction is SuccessablesHACAi++.bas, looking for 3'-ACCACA-5', 2, 3'-ACCACA-5' at 511, 3'-ACCACA-5' at 530.
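
The four search sequences used in the list above (direct, complement, inverse, and inverse complement) can be derived from one consensus sequence. The sketch below is in Python rather than Basic and is not the SuccessablesHACA programs themselves; the function names and the toy sequence are hypothetical, and positions are assumed to be 1-based starts.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def orientations(motif: str) -> dict[str, str]:
    """Return the four variants of a motif that the programs search for."""
    complement = motif.translate(COMPLEMENT)
    return {
        "direct": motif,
        "complement": complement,
        "inverse": motif[::-1],
        "inverse complement": complement[::-1],
    }


def find_all(sequence: str, motif: str) -> list[int]:
    """Return the 1-based start of every (possibly overlapping) exact match."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start + 1)
        start = sequence.find(motif, start + 1)
    return hits


toy = "TTACACCATGGTGTAA"  # hypothetical toy sequence, not taken from A1BG
for name, variant in orientations("ACACCA").items():
    print(name, variant, find_all(toy, variant))
```

For ACACCA the variants are TGTGGT (complement), ACCACA (inverse) and TGGTGT (inverse complement), the same sequences named in the list above.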

UTRs (Rozhdestvensky)

  1. Negative strand, negative direction: ACACCA at 3811, TGGTGT at 3764, ACACCA at 3187.

Proximal promoters (Rozhdestvensky)

  1. Negative strand, negative direction: ACACCA at 2659.

Distal promoters (Rozhdestvensky)

  1. Negative strand, negative direction: ACACCA at 788.
  2. Positive strand, negative direction: ACACCA at 2419, TGGTGT at 1477, ACACCA at 883, TGGTGT at 793, TGGTGT at 608.
  3. Negative strand, positive direction: TGGTGT at 530, TGGTGT at 511, ACACCA at 386.
  4. Positive strand, positive direction: ACACCA at 528, TGGTGT at 420, ACACCA at 204.

H boxes (Rozhdestvensky) random dataset samplings

  1. RDr0: 0.
  2. RDr1: 0.
  3. RDr2: 0.
  4. RDr3: 0.
  5. RDr4: 0.
  6. RDr5: 0.
  7. RDr6: 0.
  8. RDr7: 0.
  9. RDr8: 0.
  10. RDr9: 0.
  11. RDr0ci: 0.
  12. RDr1ci: 0.
  13. RDr2ci: 0.
  14. RDr3ci: 0.
  15. RDr4ci: 0.
  16. RDr5ci: 0.
  17. RDr6ci: 0.
  18. RDr7ci: 0.
  19. RDr8ci: 0.
  20. RDr9ci: 0.
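
How the random datasets were generated is not described here. As a minimal sketch, assuming uniformly random bases, a hypothetical length of 4500 nucleotides, and fixed seeds for repeatability, ten random sequences could be produced and searched as follows. The names `random_dataset` and `count_overlapping` are illustrative.

```python
import random


def random_dataset(length: int, seed: int) -> str:
    """Build a random DNA sequence with equal base probabilities (assumption)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


def count_overlapping(sequence: str, motif: str) -> int:
    """Count exact matches of motif, including overlapping ones."""
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count


# Hypothetical length and seeds; the real dataset parameters are not given.
datasets = [random_dataset(4500, seed) for seed in range(10)]
for index, sequence in enumerate(datasets):
    print(f"random {index}: {count_overlapping(sequence, 'ACACCA')}")
```

The counts depend on the seed and the assumed length, so they are not comparable number for number with the RDr entries above.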

RDr UTRs

RDr core promoters

RDr proximal promoters

RDr distal promoters

H boxes (Rozhdestvensky) analysis and results

An H box has a consensus sequence of 3'-ACACCA-5'.[4]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 0 | 2 | 0 | 0
Randoms | UTR | arbitrary negative | 0 | 10 | 0 | 0
Randoms | UTR | alternate negative | 0 | 10 | 0 | 0
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | negative | 0 | 10 | 0 | 0
Reals | Core | positive | 0 | 2 | 0 | 0
Randoms | Core | positive | 0 | 10 | 0 | 0
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | negative | 0 | 10 | 0 | 0
Reals | Proximal | positive | 0 | 2 | 0 | 0
Randoms | Proximal | positive | 0 | 10 | 0 | 0
Reals | Distal | negative | 0 | 2 | 0 | 0
Randoms | Distal | negative | 0 | 10 | 0 | 0
Reals | Distal | positive | 0 | 2 | 0 | 0
Randoms | Distal | positive | 0 | 10 | 0 | 0

Comparison:

Occurrences of the consensus sequence in the real promoters are more numerous than in the random datasets. This suggests that the real responsive element consensus sequences are likely active or activable.

H-boxes (Grandbastien)

"Two distinct sequence elements, the H-box (consensus CCTACC(N)7CT) and the G-box (CACGTG), are required for stimulation of the chs15 promoter by 4-CA."[5]

H-box consensus sequences

The earlier H-box consensus sequence is CCTACC(N)7CT.[5]

The H box in Solanaceae has the consensus sequence 3'-CC(A/T)ACCNNNNNNN(A/C)T-5'.[6]

H-box (Grandbastien) samplings

Copying the example responsive element sequence CCTACCTGGCGGAT into "⌘F" finds no occurrence between ZNF497 and A1BG, and searching for CCTACC finds none between ZSCAN22 and A1BG; the same searches can be made with the computer programs.

The Basic programs testing the consensus sequence CC(A/T)ACCNNNNNNN(A/C)T (starting with SuccessablesH-box.bas) compare it with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). For each strand and direction, the list below gives the sequence searched for, the number of occurrences found, and each occurrence with its position:

  1. negative strand, negative direction, looking for CC(A/T)ACCNNNNNNN(A/C)T, 0.
  2. positive strand, negative direction, looking for CC(A/T)ACCNNNNNNN(A/C)T, 0.
  3. positive strand, positive direction, looking for CC(A/T)ACCNNNNNNN(A/C)T, 0.
  4. negative strand, positive direction, looking for CC(A/T)ACCNNNNNNN(A/C)T, 0.
  5. complement, negative strand, negative direction, looking for GG(A/T)TGGNNNNNNN(G/T)A, 0.
  6. complement, positive strand, negative direction, looking for GG(A/T)TGGNNNNNNN(G/T)A, 0.
  7. complement, positive strand, positive direction, looking for GG(A/T)TGGNNNNNNN(G/T)A, 0.
  8. complement, negative strand, positive direction, looking for GG(A/T)TGGNNNNNNN(G/T)A, 0.
  9. inverse complement, negative strand, negative direction, looking for A(G/T)NNNNNNNGGT(A/T)GG, 0.
  10. inverse complement, positive strand, negative direction, looking for A(G/T)NNNNNNNGGT(A/T)GG, 1, AGAAGTGTTGGTTGG at 3946.
  11. inverse complement, positive strand, positive direction, looking for A(G/T)NNNNNNNGGT(A/T)GG, 0.
  12. inverse complement, negative strand, positive direction, looking for A(G/T)NNNNNNNGGT(A/T)GG, 0.
  13. inverse, negative strand, negative direction, looking for T(A/C)NNNNNNNCCA(A/T)CC, 1, TCTTCACAACCAACC at 3946.
  14. inverse, positive strand, negative direction, looking for T(A/C)NNNNNNNCCA(A/T)CC, 0.
  15. inverse, positive strand, positive direction, looking for T(A/C)NNNNNNNCCA(A/T)CC, 0.
  16. inverse, negative strand, positive direction, looking for T(A/C)NNNNNNNCCA(A/T)CC, 0.
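
Consensus notation with alternatives such as (A/T) and the wildcard N can be translated mechanically into a regular expression. The sketch below is an illustration in Python and not the SuccessablesH-box programs; the function name `consensus_to_regex` is hypothetical, and it assumes that N stands for any of A, C, G or T.

```python
import re


def consensus_to_regex(consensus: str) -> str:
    """Translate notation such as T(A/C)NNNNNNNCCA(A/T)CC into a regex."""
    parts = []
    i = 0
    while i < len(consensus):
        symbol = consensus[i]
        if symbol == "(":
            # "(A/T)" becomes the character class "[AT]".
            close = consensus.index(")", i)
            options = consensus[i + 1:close].split("/")
            parts.append("[" + "".join(options) + "]")
            i = close + 1
        elif symbol == "N":
            parts.append("[ACGT]")  # assumption: N matches any base
            i += 1
        else:
            parts.append(symbol)
            i += 1
    return "".join(parts)


inverse = consensus_to_regex("T(A/C)NNNNNNNCCA(A/T)CC")
inverse_complement = consensus_to_regex("A(G/T)NNNNNNNGGT(A/T)GG")
print(bool(re.fullmatch(inverse, "TCTTCACAACCAACC")))
print(bool(re.fullmatch(inverse_complement, "AGAAGTGTTGGTTGG")))
```

Both checks print True: the two sequences reported at position 3946 fit the inverse and inverse complement patterns, respectively.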

H-box (Grandbastien) UTRs

Positive strand, negative direction: AGAAGTGTTGGTTGG at 3946.

H-box (Grandbastien) random dataset samplings

  1. RDr0: 0.
  2. RDr1: 0.
  3. RDr2: 0.
  4. RDr3: 0.
  5. RDr4: 0.
  6. RDr5: 0.
  7. RDr6: 0.
  8. RDr7: 0.
  9. RDr8: 0.
  10. RDr9: 0.
  11. RDr0ci: 0.
  12. RDr1ci: 0.
  13. RDr2ci: 0.
  14. RDr3ci: 0.
  15. RDr4ci: 0.
  16. RDr5ci: 0.
  17. RDr6ci: 0.
  18. RDr7ci: 0.
  19. RDr8ci: 0.
  20. RDr9ci: 0.

RDr UTRs

RDr core promoters

RDr proximal promoters

RDr distal promoters

H-boxes (Grandbastien) analysis and results

The H box in Solanaceae has the consensus sequence 3'-CC(A/T)ACCNNNNNNN(A/C)T-5'.[6]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 0 | 2 | 0 | 0
Randoms | UTR | arbitrary negative | 0 | 10 | 0 | 0
Randoms | UTR | alternate negative | 0 | 10 | 0 | 0
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | negative | 0 | 10 | 0 | 0
Reals | Core | positive | 0 | 2 | 0 | 0
Randoms | Core | positive | 0 | 10 | 0 | 0
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | negative | 0 | 10 | 0 | 0
Reals | Proximal | positive | 0 | 2 | 0 | 0
Randoms | Proximal | positive | 0 | 10 | 0 | 0
Reals | Distal | negative | 0 | 2 | 0 | 0
Randoms | Distal | negative | 0 | 10 | 0 | 0
Reals | Distal | positive | 0 | 2 | 0 | 0
Randoms | Distal | positive | 0 | 10 | 0 | 0

Comparison:

Occurrences of the consensus sequence in the real promoters are more numerous than in the random datasets. This suggests that the real responsive element consensus sequences are likely active or activable.

H-box (Lindsay) samplings

Copying the responsive element consensus sequence CCTACC into "⌘F" finds no occurrence between ZNF497 and A1BG and one between ZSCAN22 and A1BG; the same search can be made with the computer programs.

The Basic programs testing the consensus sequence CCTACC (starting with SuccessablesHL-box.bas) compare it with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). For each strand and direction, the list below gives the sequence searched for, the number of occurrences found, and each occurrence with its position:

  1. negative strand, negative direction, looking for CCTACC, 0.
  2. positive strand, negative direction, looking for CCTACC, 0.
  3. positive strand, positive direction, looking for CCTACC, 1, CCTACC at 1879.
  4. negative strand, positive direction, looking for CCTACC, 1, CCTACC at 1196.
  5. complement, negative strand, negative direction, looking for GGATGG, 0.
  6. complement, positive strand, negative direction, looking for GGATGG, 0.
  7. complement, positive strand, positive direction, looking for GGATGG, 1, GGATGG at 1196.
  8. complement, negative strand, positive direction, looking for GGATGG, 1, GGATGG at 1879.
  9. inverse complement, negative strand, negative direction, looking for GGTAGG, 2, GGTAGG at 4456, GGTAGG at 1838.
  10. inverse complement, positive strand, negative direction, looking for GGTAGG, 1, GGTAGG at 119.
  11. inverse complement, positive strand, positive direction, looking for GGTAGG, 2, GGTAGG at 3629, GGTAGG at 3108.
  12. inverse complement, negative strand, positive direction, looking for GGTAGG, 1, GGTAGG at 3753.
  13. inverse, negative strand, negative direction, looking for CCATCC, 1, CCATCC at 119.
  14. inverse, positive strand, negative direction, looking for CCATCC, 2, CCATCC at 4456, CCATCC at 1838.
  15. inverse, positive strand, positive direction, looking for CCATCC, 1, CCATCC at 3753.
  16. inverse, negative strand, positive direction, looking for CCATCC, 2, CCATCC at 3629, CCATCC at 3108.

H-box (Lindsay) UTRs

Negative strand: GGTAGG at 4456.

H-box (Lindsay) proximal promoters

Negative strand: GGTAGG at 1838.

Positive strand: GGTAGG at 119.

H-box (Lindsay) distal promoters

Negative strand: GGTAGG at 3753, CCTACC at 1196.

Positive strand: GGTAGG at 3629, GGTAGG at 3108, CCTACC at 1879.

H-box (Lindsay) random dataset samplings

  1. RDr0: 0.
  2. RDr1: 0.
  3. RDr2: 0.
  4. RDr3: 0.
  5. RDr4: 0.
  6. RDr5: 0.
  7. RDr6: 0.
  8. RDr7: 0.
  9. RDr8: 0.
  10. RDr9: 0.
  11. RDr0ci: 0.
  12. RDr1ci: 0.
  13. RDr2ci: 0.
  14. RDr3ci: 0.
  15. RDr4ci: 0.
  16. RDr5ci: 0.
  17. RDr6ci: 0.
  18. RDr7ci: 0.
  19. RDr8ci: 0.
  20. RDr9ci: 0.

RDr UTRs

RDr core promoters

RDr proximal promoters

RDr distal promoters

H-boxes (Lindsay) analysis and results

"The KAP-2 protein [...] binds to the H-box (CCTACC) element in the bean CHS15 chalcone synthase promoter".[7]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 0 | 2 | 0 | 0
Randoms | UTR | arbitrary negative | 0 | 10 | 0 | 0
Randoms | UTR | alternate negative | 0 | 10 | 0 | 0
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | negative | 0 | 10 | 0 | 0
Reals | Core | positive | 0 | 2 | 0 | 0
Randoms | Core | positive | 0 | 10 | 0 | 0
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | negative | 0 | 10 | 0 | 0
Reals | Proximal | positive | 0 | 2 | 0 | 0
Randoms | Proximal | positive | 0 | 10 | 0 | 0
Reals | Distal | negative | 0 | 2 | 0 | 0
Randoms | Distal | negative | 0 | 10 | 0 | 0
Reals | Distal | positive | 0 | 2 | 0 | 0
Randoms | Distal | positive | 0 | 10 | 0 | 0

Comparison:

Occurrences of the consensus sequence in the real promoters are more numerous than in the random datasets. This suggests that the real responsive element consensus sequences are likely active or activable.

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

References

  1. Paul J Rushton and Imre E Somssich (August 1998). "Transcriptional control of plant genes responsive to pathogens" (PDF). Current Opinion in Plant Biology. 1 (4): 311–5. doi:10.1016/1369-5266(88)80052-9. Retrieved 5 November 2018.
  2. Jun Zhong, Antoine H.F.M. Peters, Kathy Kafer, Robert E. Braun (1 June 2001). "A Highly Conserved Sequence Essential for Translational Repression of the Protamine 1 Messenger RNA in Murine Spermatids". Biology of Reproduction. 64 (6): 1784–1789. doi:10.1095/biolreprod64.6.1784. Retrieved 5 November 2018.
  3. James R. Mitchell, Jeffrey Cheng, and Kathleen Collins (January 1999). "A Box H/ACA Small Nucleolar RNA-Like Domain at the Human Telomerase RNA 3' End" (PDF). Molecular and Cellular Biology. 19 (1): 567–576. Retrieved 5 November 2018.
  4. Timofey S. Rozhdestvensky, Thean Hock Tang, Inna V. Tchirkova, Jürgen Brosius, Jean‐Pierre Bachellerie and Alexander Hüttenhofer (2003). "Binding of L7Ae protein to the K‐turn of archaeal snoRNAs: a shared RNA binding motif for C/D and H/ACA box snoRNAs in Archaea". Nucleic Acids Research. 31 (3): 869–77. doi:10.1093/nar/gkg175. Retrieved 2014-06-08.
  5. Gary J. Loake, Ouriel Faktor, Christopher J. Lamb, and Richard A. Dixon (October 1992). "Combination of H-box [CCTACC(N)7CT] and G-box (CACGTG) cis elements is necessary for feed-forward stimulation of a chalcone synthase promoter by the phenylpropanoid-pathway intermediate p-coumaric acid" (PDF). Proceedings of the National Academy of Sciences USA. 89 (19): 9230–9234. doi:10.1073/pnas.89.19.9230. Retrieved 16 March 2021.
  6. M.-A. Grandbastien, C. Audeon, E. Bonnivard, J.M. Casacuberta, B. Chalhoub, A.-P.P. Costa, Q.H. Le, D. Melayah, M. Petit, C. Poncet, S.M. Tam, M.-A. Van Sluys, C. Mhiri (July 2005). "Stress activation and genomic impact of Tnt1 retrotransposons in Solanaceae" (PDF). Cytogenetic and Genomic Research. 110 (1–4): 229–41. doi:10.1159/000084957. Retrieved 5 November 2018.
  7. William P. Lindsay, Fiona M. McAlister, Qun Zhu, Xian-Zhi He, Wolfgang Dröge-Laser, Susie Hedrick, Peter Doerner, Chris Lamb and Richard A. Dixon (July 2002). "KAP-2, a protein that binds to the H-box in a bean chalcone synthase promoter, is a novel plant transcription factor with sequence identity to the large subunit of human Ku autoantigen". Plant Molecular Biology. 49 (5): 503–514. doi:10.1023/A:1015505316379. Retrieved 5 October 2019.
