N box gene transcriptions

Revision as of 15:43, 5 September 2023 by Marshallsumter

Associate Editor(s)-in-Chief: Henry A. Hoff

"Human pColQ1a carries consensus sequences for transcriptional factors E-protein (E-box, CANNTG), NFAT (GGAAA), c-Ets transcription factor [c-Ets, (C/A)GGA(A/T)], Elk-1, N-box (CCGGAA), and MEF2 (CTAAAAATAA), which play essential roles in muscle-specific and NMJ-specific transcriptional activities (Lee et al., 2004)."[1]

Human genes

Gene expressions

Interactions

Consensus sequences

CCGGAA[1]

Binding site for

"The [basic helix–loop–helix] bHLH proteins from group E were usually bound to the CACGCG [Coupling element] or CACGAG (N-box) motif."[2]

"Group E comprises two families in which the proteins have a conserved Pro or Gly residue within the basic region that mediates preferential binding to the N-box sequences CACGGC or CACGAC."[3]

"The HEY1 gene binds E-box (CANGTG) and N-box (CACNAG) sites (31,32)."[4]

The "putative consensus binding sites of Notch target genes in human IDE promoter" included an N-box at position -3711/-3715, counted from "the first translation start site (ATG)", in the forward (+) orientation; it matches the consensus sequence CACNAG of the bHLH protein HES-1 and showed strong DNA binding activity.[5] At the closer position -310/-305, in the reverse (-) orientation, the CACNAG site of the bHLH protein Hey-1 showed weak DNA binding activity.[5] The "Class C" DNA binding site at position -379/-374, in the reverse (-) orientation, with the consensus sequence CACGNG of the bHLH protein Hey-1, showed strong DNA binding activity.[5]

Promoter occurrences

"Transcriptional upregulation of pColQ1a at the [neuromuscular junction] NMJ is mediated by the N-box (CCGGAA; Lee et al., 2004). The N-box (CGGAA) is also the neuregulin-response element in mouse Chrnd encoding the δ subunit of acetylcholine receptor (AChR; Fromm and Burden, 1998) and in rat Chrne encoding the AChR ε subunit (Sapru et al., 1998). Similar involvement of the neuregulin-responsive N-box in the NMJ-specific transcription of ColQ in mouse have also been reported (Lee et al., 2004; Ting et al., 2005). Also, specific activation of pColQ1 and pColQ1a in slow- and fast-twitch muscles, respectively, is mediated by a slow upstream regulatory element (SURE) in pColQ1 and a fast intronic regulatory element (FIRE) in pColQ1a (Lee et al., 2004; Ting et al., 2005)."[1]

Hypotheses

  1. A1BG has no regulatory elements in either promoter.
  2. A1BG is not transcribed by a regulatory element.
  3. No regulatory element participates in the transcription of A1BG.

N-box (Lee) samplings

Copying the response element consensus sequence CCGGAA into the find function ("⌘F") finds no occurrences between ZNF497 and A1BG and none between ZSCAN22 and A1BG, as can be found by the computer programs.

The Basic programs testing the consensus sequence CCGGAA (starting with SuccessablesN-box.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, the sequences they look for, and the occurrences they found are:

  1. negative strand, negative direction, looking for CCGGAA, 1, CCGGAA at 3569.
  2. positive strand, negative direction, looking for CCGGAA, 0.
  3. positive strand, positive direction, looking for CCGGAA, 10, CCGGAA at 1797, CCGGAA at 1515, CCGGAA at 1431, CCGGAA at 1331, CCGGAA at 1263, CCGGAA at 1011, CCGGAA at 927, CCGGAA at 827, CCGGAA at 675, CCGGAA at 591.
  4. negative strand, positive direction, looking for CCGGAA, 0.
  5. complement, negative strand, negative direction, looking for GGCCTT, 0.
  6. complement, positive strand, negative direction, looking for GGCCTT, 1, GGCCTT at 3569.
  7. complement, positive strand, positive direction, looking for GGCCTT, 0.
  8. complement, negative strand, positive direction, looking for GGCCTT, 10, GGCCTT at 1797, GGCCTT at 1515, GGCCTT at 1431, GGCCTT at 1331, GGCCTT at 1263, GGCCTT at 1011, GGCCTT at 927, GGCCTT at 827, GGCCTT at 675, GGCCTT at 591.
  9. inverse complement, negative strand, negative direction, looking for TTCCGG, 1, TTCCGG at 4167.
  10. inverse complement, positive strand, negative direction, looking for TTCCGG, 0.
  11. inverse complement, positive strand, positive direction, looking for TTCCGG, 10, TTCCGG at 1513, TTCCGG at 1429, TTCCGG at 1329, TTCCGG at 1261, TTCCGG at 1009, TTCCGG at 925, TTCCGG at 825, TTCCGG at 673, TTCCGG at 589, TTCCGG at 212.
  12. inverse complement, negative strand, positive direction, looking for TTCCGG, 1, TTCCGG at 4244.
  13. inverse, negative strand, negative direction, looking for AAGGCC, 0.
  14. inverse, positive strand, negative direction, looking for AAGGCC, 1, AAGGCC at 4167.
  15. inverse, positive strand, positive direction, looking for AAGGCC, 1, AAGGCC at 4244.
  16. inverse, negative strand, positive direction, looking for AAGGCC, 10, AAGGCC at 1513, AAGGCC at 1429, AAGGCC at 1329, AAGGCC at 1261, AAGGCC at 1009, AAGGCC at 925, AAGGCC at 825, AAGGCC at 673, AAGGCC at 589, AAGGCC at 212.
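
The sixteen searches above combine four orientations of the consensus sequence (as written, complement, inverse complement, inverse) with two strands and two directions. As a rough illustration only, and not the Basic programs themselves, the Python sketch below derives the four orientations of CCGGAA and reports match positions in a made-up test string. The function names `motif_variants` and `find_all`, the toy sequence, and the choice to report 1-based start positions are assumptions introduced for this example.

```python
# Hypothetical helpers for illustration; not the SuccessablesN-box.bas code.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def motif_variants(motif):
    """Return the four orientations of an exact motif."""
    complement = motif.translate(COMPLEMENT)
    return {
        "direct": motif,
        "complement": complement,
        "inverse complement": complement[::-1],
        "inverse": motif[::-1],
    }


def find_all(sequence, motif):
    """Return 1-based start positions (an assumed convention) of all matches,
    including overlapping ones."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions


variants = motif_variants("CCGGAA")
toy = "AATTCCGGAATT"  # made-up sequence, not A1BG promoter data
hits = {name: find_all(toy, seq) for name, seq in variants.items()}
```

In the toy string, TTCCGG starts two positions before CCGGAA because the two motifs overlap in TTCCGGAA, the same offset seen in the paired positions listed above (for example, TTCCGG at 1513 and CCGGAA at 1515).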

N-box (Lee) (4560-2846) UTRs

  1. Negative strand, negative direction: TTCCGG at 4167, CCGGAA at 3569.

N-box (Lee) positive direction (4265-4050) proximal promoters

  1. Negative strand, positive direction: TTCCGG at 4244.

N-box (Lee) positive direction (4050-1) distal promoters

  1. Positive strand, positive direction: CCGGAA at 1797, CCGGAA at 1515, TTCCGG at 1513, CCGGAA at 1431, TTCCGG at 1429, CCGGAA at 1331, TTCCGG at 1329, CCGGAA at 1263, TTCCGG at 1261, CCGGAA at 1011, TTCCGG at 1009, CCGGAA at 927, TTCCGG at 925, CCGGAA at 827, TTCCGG at 825, CCGGAA at 675, TTCCGG at 673, CCGGAA at 591, TTCCGG at 589, TTCCGG at 212.
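
The groupings above sort each reported position into a region by its coordinate. A minimal sketch of that bookkeeping follows, using the coordinate ranges that appear in the section headings on this page. The names `REGIONS` and `classify` are hypothetical, and because neighbouring headings share their endpoints (for example 4050), treating each range as excluding its lower bound and including its upper bound is an assumption.

```python
# Ranges copied from the section headings; (lower, upper] handling is assumed.
REGIONS = {
    "negative": [
        ("UTR", 2846, 4560),
        ("core promoter", 2811, 2846),
        ("proximal promoter", 2596, 2811),
        ("distal promoter", 0, 2596),
    ],
    "positive": [
        ("core promoter", 4265, 4445),
        ("proximal promoter", 4050, 4265),
        ("distal promoter", 0, 4050),
    ],
}


def classify(position, direction):
    """Return the region name for a position, or None if no listed range fits."""
    for name, lower, upper in REGIONS[direction]:
        if lower < position <= upper:
            return name
    return None


# Positions taken from the lists above.
utr = classify(3569, "negative")
proximal = classify(4244, "positive")
distal = classify(1797, "positive")
```

Positions that fall outside every listed range for a direction, such as a positive-direction position above 4445, come back as `None` rather than being guessed.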

N box (Lee) random dataset samplings

  1. Leer0: 3, CCGGAA at 3912, CCGGAA at 2751, CCGGAA at 2633.
  2. Leer1: 3, CCGGAA at 2465, CCGGAA at 1420, CCGGAA at 938.
  3. Leer2: 1, CCGGAA at 4127.
  4. Leer3: 2, CCGGAA at 2898, CCGGAA at 760.
  5. Leer4: 0.
  6. Leer5: 2, CCGGAA at 3337, CCGGAA at 3030.
  7. Leer6: 1, CCGGAA at 1191.
  8. Leer7: 2, CCGGAA at 3865, CCGGAA at 2996.
  9. Leer8: 2, CCGGAA at 2126, CCGGAA at 2079.
  10. Leer9: 1, CCGGAA at 3062.
  11. Leer0ci: 0.
  12. Leer1ci: 1, TTCCGG at 817.
  13. Leer2ci: 4, TTCCGG at 3972, TTCCGG at 3397, TTCCGG at 1249, TTCCGG at 648.
  14. Leer3ci: 1, TTCCGG at 3658.
  15. Leer4ci: 0.
  16. Leer5ci: 0.
  17. Leer6ci: 2, TTCCGG at 2818, TTCCGG at 1798.
  18. Leer7ci: 2, TTCCGG at 1827, TTCCGG at 687.
  19. Leer8ci: 2, TTCCGG at 2124, TTCCGG at 119.
  20. Leer9ci: 1, TTCCGG at 3340.

Leer arbitrary (evens) (4560-2846) UTRs

  1. Leer0: CCGGAA at 3912.
  2. Leer2: CCGGAA at 4127.
  3. Leer2ci: TTCCGG at 3972, TTCCGG at 3397.

Leer alternate (odds) (4560-2846) UTRs

  1. Leer3: CCGGAA at 2898.
  2. Leer5: CCGGAA at 3337, CCGGAA at 3030.
  3. Leer7: CCGGAA at 3865, CCGGAA at 2996.
  4. Leer9: CCGGAA at 3062.
  5. Leer3ci: TTCCGG at 3658.
  6. Leer9ci: TTCCGG at 3340.

Leer arbitrary negative direction (evens) (2846-2811) core promoters

  1. Leer6ci: TTCCGG at 2818.

Leer arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. Leer0: CCGGAA at 2751, CCGGAA at 2633.

Leer alternate positive direction (evens) (4265-4050) proximal promoters

  1. Leer2: CCGGAA at 4127.

Leer arbitrary negative direction (evens) (2596-1) distal promoters

  1. Leer8: CCGGAA at 2126, CCGGAA at 2079.
  2. Leer2ci: TTCCGG at 1249, TTCCGG at 648.
  3. Leer6ci: TTCCGG at 1798.
  4. Leer8ci: TTCCGG at 2124, TTCCGG at 119.

Leer alternate negative direction (odds) (2596-1) distal promoters

  1. Leer1: CCGGAA at 2465, CCGGAA at 1420, CCGGAA at 938.
  2. Leer3: CCGGAA at 760.
  3. Leer5: CCGGAA at 3337, CCGGAA at 3030.
  4. Leer1ci: TTCCGG at 817.
  5. Leer7ci: TTCCGG at 1827, TTCCGG at 687.

Leer arbitrary positive direction (odds) (4050-1) distal promoters

  1. Leer1: CCGGAA at 2465, CCGGAA at 1420, CCGGAA at 938.
  2. Leer3: CCGGAA at 2898, CCGGAA at 760.
  3. Leer5: CCGGAA at 3337, CCGGAA at 3030.
  4. Leer7: CCGGAA at 3865, CCGGAA at 2996.
  5. Leer9: CCGGAA at 3062.
  6. Leer1ci: TTCCGG at 817.
  7. Leer3ci: TTCCGG at 3658.
  8. Leer7ci: TTCCGG at 1827, TTCCGG at 687.
  9. Leer9ci: TTCCGG at 3340.

Leer alternate positive direction (evens) (4050-1) distal promoters

  1. Leer0: CCGGAA at 3912, CCGGAA at 2751, CCGGAA at 2633.
  2. Leer8: CCGGAA at 2126, CCGGAA at 2079.
  3. Leer2ci: TTCCGG at 3972, TTCCGG at 3397, TTCCGG at 1249, TTCCGG at 648.
  4. Leer6ci: TTCCGG at 2818, TTCCGG at 1798.
  5. Leer8ci: TTCCGG at 2124, TTCCGG at 119.
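
The headings above split the random datasets by the parity of their number and then label each half "arbitrary" or "alternate" depending on direction. Read off the headings, evens are "arbitrary" for the negative direction and "alternate" for the positive direction, with odds the reverse. The sketch below encodes that reading; the function names `parity_group` and `label` are hypothetical.

```python
import re


def parity_group(dataset_name):
    """Return 'evens' or 'odds' from the number in a name such as 'Leer2ci'."""
    match = re.search(r"\d+", dataset_name)
    if match is None:
        raise ValueError(f"no dataset number in {dataset_name!r}")
    return "evens" if int(match.group()) % 2 == 0 else "odds"


def label(dataset_name, direction):
    """Map a dataset and direction to 'arbitrary' or 'alternate',
    following the pairing used in the headings above."""
    group = parity_group(dataset_name)
    if direction == "negative":
        return "arbitrary" if group == "evens" else "alternate"
    return "alternate" if group == "evens" else "arbitrary"


examples = {
    ("Leer2ci", "negative"): label("Leer2ci", "negative"),
    ("Leer7", "positive"): label("Leer7", "positive"),
    ("Leer0", "positive"): label("Leer0", "positive"),
}
```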

N box (Lee) analysis and results

"Human pColQ1a carries consensus sequences for transcriptional factors E-protein (E-box, CANNTG), NFAT (GGAAA), c-Ets transcription factor [c-Ets, (C/A)GGA(A/T)], Elk-1, N-box (CCGGAA), and MEF2 (CTAAAAATAA), which play essential roles in muscle-specific and NMJ-specific transcriptional activities (Lee et al., 2004)."[1]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 2 | 2 | 1 | 1 ± 1 (--2,+-0)
Randoms | UTR | arbitrary negative | 4 | 10 | 0.4 | 0.6 ± 0.2
Randoms | UTR | alternate negative | 8 | 10 | 0.8 | 0.6 ± 0.2
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | arbitrary negative | 1 | 10 | 0.1 | 0.05
Randoms | Core | alternate negative | 0 | 10 | 0 | 0.05
Reals | Core | positive | 0 | 2 | 0 | 0
Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0
Randoms | Core | alternate positive | 0 | 10 | 0 | 0
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary negative | 2 | 10 | 0.2 | 0.1
Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0.1
Reals | Proximal | positive | 1 | 2 | 0.5 | 0.5 ± 0.5 (-+1,++0)
Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0.05
Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.05
Reals | Distal | negative | 0 | 2 | 0 | 0
Randoms | Distal | arbitrary negative | 7 | 10 | 0.7 | 0.8
Randoms | Distal | alternate negative | 9 | 10 | 0.9 | 0.8
Reals | Distal | positive | 20 | 2 | 10 | 10 ± 10 (-+0,++20)
Randoms | Distal | arbitrary positive | 15 | 10 | 1.5 | 1.4
Randoms | Distal | alternate positive | 13 | 10 | 1.3 | 1.4
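
The Occurrences and Averages columns above appear to follow simple arithmetic: occurrences are the number of hits divided by the number of strands (or datasets), and the average is the midpoint of two values with half their difference as the "±" term. The sketch below reproduces the Randoms UTR rows under that reading. The function names `occurrences` and `mean_and_spread` are hypothetical, and the formula is inferred from the table values rather than stated in the source.

```python
def occurrences(number, strands):
    """Hits per strand (or per random dataset)."""
    return number / strands


def mean_and_spread(a, b):
    """Midpoint of two values and half their absolute difference."""
    return (a + b) / 2, abs(a - b) / 2


# Randoms UTR rows: 4 hits in 10 datasets and 8 hits in 10 datasets.
arbitrary = occurrences(4, 10)
alternate = occurrences(8, 10)
mean, spread = mean_and_spread(arbitrary, alternate)
rounded = (round(mean, 2), round(spread, 2))  # (0.6, 0.2)
```

With the per-strand real counts 0 and 20 from the Distal positive row, the same helper gives 10 ± 10.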

Comparison:

The occurrences of real N box (Lee) sequences in the UTRs, proximal promoters, and distal promoters are greater than those in the random datasets. This suggests that the real N box (Lee) sequences are likely active or activable.

N-box (Bai) samplings

Copying the response element consensus sequence CACGAG into the find function ("⌘F") finds one occurrence between ZNF497 and A1BG and one between ZSCAN22 and A1BG, as can be found by the computer programs.

The Basic programs testing the consensus sequence CACGAG (starting with SuccessablesN-boxB.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, the sequences they look for, and the occurrences they found are:

  1. negative strand, negative direction, looking for CACGAG, 1, CACGAG at 4403.
  2. positive strand, negative direction, looking for CACGAG, 6, CACGAG at 4472, CACGAG at 3232, CACGAG at 1182, CACGAG at 708, CACGAG at 572, CACGAG at 435.
  3. positive strand, positive direction, looking for CACGAG, 2, CACGAG at 3152, CACGAG at 2090.
  4. negative strand, positive direction, looking for CACGAG, 1, CACGAG at 243.
  5. complement, negative strand, negative direction, looking for GTGCTC, 6, GTGCTC at 4472, GTGCTC at 3232, GTGCTC at 1182, GTGCTC at 708, GTGCTC at 572, GTGCTC at 435.
  6. complement, positive strand, negative direction, looking for GTGCTC, 1, GTGCTC at 4403.
  7. complement, positive strand, positive direction, looking for GTGCTC, 1, GTGCTC at 243.
  8. complement, negative strand, positive direction, looking for GTGCTC, 2, GTGCTC at 3152, GTGCTC at 2090.
  9. inverse complement, negative strand, negative direction, looking for CTCGTG, 1, CTCGTG at 3914.
  10. inverse complement, positive strand, negative direction, looking for CTCGTG, 0.
  11. inverse complement, positive strand, positive direction, looking for CTCGTG, 5, CTCGTG at 3739, CTCGTG at 1627, CTCGTG at 1207, CTCGTG at 955, CTCGTG at 855.
  12. inverse complement, negative strand, positive direction, looking for CTCGTG, 1, CTCGTG at 4376.
  13. inverse, negative strand, negative direction, looking for GAGCAC, 0.
  14. inverse, positive strand, negative direction, looking for GAGCAC, 1, GAGCAC at 3914.
  15. inverse, positive strand, positive direction, looking for GAGCAC, 1, GAGCAC at 4376.
  16. inverse, negative strand, positive direction, looking for GAGCAC, 5, GAGCAC at 3739, GAGCAC at 1627, GAGCAC at 1207, GAGCAC at 955, GAGCAC at 855.

N-box (Bai) (4560-2846) UTRs

  1. Negative strand, negative direction: CACGAG at 4403, CTCGTG at 3914.
  2. Positive strand, negative direction: CACGAG at 4472, CACGAG at 3232.

N-box (Bai) positive direction (4445-4265) core promoters

  1. Negative strand, positive direction: CTCGTG at 4376.

N-box (Bai) negative direction (2596-1) distal promoters

  1. Positive strand, negative direction: CACGAG at 1182, CACGAG at 708, CACGAG at 572, CACGAG at 435.

N-box (Bai) positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: CACGAG at 243.
  2. Positive strand, positive direction: CTCGTG at 3739, CACGAG at 3152, CACGAG at 2090, CTCGTG at 1627, CTCGTG at 1207, CTCGTG at 955, CTCGTG at 855.

N box (Bai) random dataset samplings

  1. Bair0: 3, CACGAG at 2462, CACGAG at 1070, CACGAG at 443.
  2. Bair1: 1, CACGAG at 3665.
  3. Bair2: 1, CACGAG at 1005.
  4. Bair3: 0.
  5. Bair4: 0.
  6. Bair5: 0.
  7. Bair6: 0.
  8. Bair7: 1, CACGAG at 1241.
  9. Bair8: 1, CACGAG at 2995.
  10. Bair9: 0.
  11. Bair0ci: 0.
  12. Bair1ci: 2, CTCGTG at 2548, CTCGTG at 1377.
  13. Bair2ci: 0.
  14. Bair3ci: 1, CTCGTG at 3114.
  15. Bair4ci: 0.
  16. Bair5ci: 0.
  17. Bair6ci: 0.
  18. Bair7ci: 2, CTCGTG at 3824, CTCGTG at 1007.
  19. Bair8ci: 2, CTCGTG at 3664, CTCGTG at 1212.
  20. Bair9ci: 0.

Bair arbitrary (evens) (4560-2846) UTRs

  1. Bair8: CACGAG at 2995.
  2. Bair8ci: CTCGTG at 3664.

Bair alternate (odds) (4560-2846) UTRs

  1. Bair1: CACGAG at 3665.
  2. Bair3ci: CTCGTG at 3114.
  3. Bair7ci: CTCGTG at 3824.

Bair arbitrary negative direction (evens) (2596-1) distal promoters

  1. Bair0: CACGAG at 2462, CACGAG at 1070, CACGAG at 443.
  2. Bair2: CACGAG at 1005.
  3. Bair8ci: CTCGTG at 1212.

Bair alternate negative direction (odds) (2596-1) distal promoters

  1. Bair7: CACGAG at 1241.
  2. Bair1ci: CTCGTG at 2548, CTCGTG at 1377.
  3. Bair7ci: CTCGTG at 1007.

Bair arbitrary positive direction (odds) (4050-1) distal promoters

  1. Bair1: CACGAG at 3665.
  2. Bair7: CACGAG at 1241.
  3. Bair1ci: CTCGTG at 2548, CTCGTG at 1377.
  4. Bair3ci: CTCGTG at 3114.
  5. Bair7ci: CTCGTG at 3824, CTCGTG at 1007.

Bair alternate positive direction (evens) (4050-1) distal promoters

  1. Bair0: CACGAG at 2462, CACGAG at 1070, CACGAG at 443.
  2. Bair2: CACGAG at 1005.
  3. Bair8: CACGAG at 2995.
  4. Bair8ci: CTCGTG at 3664, CTCGTG at 1212.

N-box (Bai) analysis and results

"The [basic helix–loop–helix] bHLH proteins from group E were usually bound to the CACGCG [Coupling element] or CACGAG (N-box) motif."[2]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 4 | 2 | 2 | 2 ± 0 (--2,+-2)
Randoms | UTR | arbitrary negative | 2 | 10 | 0.2 | 0.25
Randoms | UTR | alternate negative | 3 | 10 | 0.3 | 0.25
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Core | alternate negative | 0 | 10 | 0 | 0
Reals | Core | positive | 1 | 2 | 0.5 | 0.5 ± 0.5 (-+1,++0)
Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0
Randoms | Core | alternate positive | 0 | 10 | 0 | 0
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0
Reals | Proximal | positive | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0
Randoms | Proximal | alternate positive | 0 | 10 | 0 | 0
Reals | Distal | negative | 4 | 2 | 2 | 2 ± 2 (--0,+-4)
Randoms | Distal | arbitrary negative | 5 | 10 | 0.5 | 0.45
Randoms | Distal | alternate negative | 4 | 10 | 0.4 | 0.45
Reals | Distal | positive | 8 | 2 | 4 | 4 ± 3 (-+1,++7)
Randoms | Distal | arbitrary positive | 7 | 10 | 0.7 | 0.7
Randoms | Distal | alternate positive | 7 | 10 | 0.7 | 0.7

Comparison:

The occurrences of real N-box (Bai) sequences are greater than those in the random datasets. This suggests that the real N-box (Bai) sequences are likely active or activable.

N-box (Gao) samplings

Copying the response element consensus sequence CACG(A/G)C into the find function ("⌘F") finds three occurrences between ZNF497 and A1BG and none between ZSCAN22 and A1BG, as can be found by the computer programs.

The Basic programs testing the consensus sequence CACG(A/G)C (starting with SuccessablesAAA.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, the sequences they look for, and the occurrences they found are:

  1. negative strand, negative direction, looking for CACG(A/G)C, 1, CACGAC at 3956.
  2. positive strand, negative direction, looking for CACG(A/G)C, 0.
  3. positive strand, positive direction, looking for CACG(A/G)C, 0.
  4. negative strand, positive direction, looking for CACG(A/G)C, 3, CACGGC at 1699, CACGGC at 980, CACGGC at 880.
  5. complement, negative strand, negative direction, looking for GTGC(C/T)G, 0.
  6. complement, positive strand, negative direction, looking for GTGC(C/T)G, 1, GTGCTG at 3956.
  7. complement, positive strand, positive direction, looking for GTGC(C/T)G, 3, GTGCCG at 1699, GTGCCG at 980, GTGCCG at 880.
  8. complement, negative strand, positive direction, looking for GTGC(C/T)G, 0.
  9. inverse complement, negative strand, negative direction, looking for G(C/T)CGTG, 8, GTCGTG at 3733, GTCGTG at 3072, GTCGTG at 1787, GTCGTG at 1142, GCCGTG at 959, GTCGTG at 678, GTCGTG at 542, GTCGTG at 405.
  10. inverse complement, positive strand, negative direction, looking for G(C/T)CGTG, 0.
  11. inverse complement, positive strand, positive direction, looking for G(C/T)CGTG, 11, GCCGTG at 4005, GTCGTG at 3043, GTCGTG at 2200, GTCGTG at 2104, GCCGTG at 1639, GTCGTG at 1459, GTCGTG at 1359, GTCGTG at 1123, GTCGTG at 1039, GTCGTG at 787, GTCGTG at 619.
  12. inverse complement, negative strand, positive direction, looking for G(C/T)CGTG, 2, GCCGTG at 3812, GTCGTG at 79.
  13. inverse, negative strand, negative direction, looking for C(A/G)GCAC, 0.
  14. inverse, positive strand, negative direction, looking for C(A/G)GCAC, 8, CAGCAC at 3733, CAGCAC at 3072, CAGCAC at 1787, CAGCAC at 1142, CGGCAC at 959, CAGCAC at 678, CAGCAC at 542, CAGCAC at 405.
  15. inverse, positive strand, positive direction, looking for C(A/G)GCAC, 2, CGGCAC at 3812, CAGCAC at 79.
  16. inverse, negative strand, positive direction, looking for C(A/G)GCAC, 11, CGGCAC at 4005, CAGCAC at 3043, CAGCAC at 2200, CAGCAC at 2104, CGGCAC at 1639, CAGCAC at 1459, CAGCAC at 1359, CAGCAC at 1123, CAGCAC at 1039, CAGCAC at 787, CAGCAC at 619.
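
Unlike the exact motifs earlier on this page, CACG(A/G)C allows two bases at one position, so each search above can return more than one concrete sequence (CACGAC or CACGGC). One way to express such a consensus, sketched here as an assumption rather than as what the Basic programs do, is to turn each parenthesised choice into a regular-expression character class. The names `consensus_to_regex` and `find_consensus`, the made-up test string, and the 1-based start positions are all introduced for this example.

```python
import re


def consensus_to_regex(consensus):
    """Convert a consensus such as 'CACG(A/G)C' into the pattern 'CACG[AG]C'."""
    return re.sub(
        r"\(([ACGT](?:/[ACGT])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )


def find_consensus(sequence, consensus):
    """Return (1-based start, matched sequence) pairs, overlaps included."""
    # The lookahead lets matches overlap instead of consuming the sequence.
    regex = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


pattern = consensus_to_regex("CACG(A/G)C")
toy = "TTCACGACGGCACGGCAA"  # made-up sequence, not A1BG promoter data
found = find_consensus(toy, "CACG(A/G)C")
```

The same conversion applies to the other orientations listed above, for example G(C/T)CGTG becomes `G[CT]CGTG`.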

N-box (Gao) (4560-2846) UTRs

  1. Negative strand, negative direction: CACGAC at 3956, GTCGTG at 3733, GTCGTG at 3072.

N-box (Gao) negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: GTCGTG at 1787, GTCGTG at 1142, GCCGTG at 959, GTCGTG at 678, GTCGTG at 542, GTCGTG at 405.

N-box (Gao) positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: GCCGTG at 3812, CACGGC at 1699, CACGGC at 980, CACGGC at 880, GTCGTG at 79.
  2. Positive strand, positive direction: GCCGTG at 4005, GTCGTG at 3043, GTCGTG at 2200, GTCGTG at 2104, GCCGTG at 1639, GTCGTG at 1459, GTCGTG at 1359, GTCGTG at 1123, GTCGTG at 1039, GTCGTG at 787, GTCGTG at 619.

N-box (Gao) random dataset samplings

  1. Gaor0: 2, CACGAC at 3647, CACGGC at 3197.
  2. Gaor1: 3, CACGAC at 3245, CACGAC at 1761, CACGGC at 989.
  3. Gaor2: 1, CACGGC at 547.
  4. Gaor3: 1, CACGGC at 1954.
  5. Gaor4: 0.
  6. Gaor5: 3, CACGGC at 3952, CACGGC at 3743, CACGGC at 567.
  7. Gaor6: 1, CACGAC at 3378.
  8. Gaor7: 2, CACGAC at 1941, CACGAC at 1521.
  9. Gaor8: 2, CACGGC at 4040, CACGGC at 944.
  10. Gaor9: 4, CACGAC at 2147, CACGGC at 770, CACGAC at 382, CACGGC at 216.
  11. Gaor0ci: 2, GCCGTG at 2383, GCCGTG at 1064.
  12. Gaor1ci: 1, GCCGTG at 3500.
  13. Gaor2ci: 0.
  14. Gaor3ci: 0.
  15. Gaor4ci: 0.
  16. Gaor5ci: 1, GCCGTG at 2968.
  17. Gaor6ci: 1, GCCGTG at 335.
  18. Gaor7ci: 5, GTCGTG at 4301, GCCGTG at 4041, GCCGTG at 3131, GCCGTG at 2231, GTCGTG at 1400.
  19. Gaor8ci: 1, GTCGTG at 3845.
  20. Gaor9ci: 1, GTCGTG at 2702.

Gaor arbitrary (evens) (4560-2846) UTRs

  1. Gaor0: CACGAC at 3647, CACGGC at 3197.
  2. Gaor8: CACGGC at 4040.
  3. Gaor8ci: GTCGTG at 3845.

Gaor alternate (odds) (4560-2846) UTRs

  1. Gaor1: CACGAC at 3245.
  2. Gaor5: CACGGC at 3952, CACGGC at 3743.
  3. Gaor1ci: GCCGTG at 3500.
  4. Gaor5ci: GCCGTG at 2968.
  5. Gaor7ci: GTCGTG at 4301, GCCGTG at 4041, GCCGTG at 3131.

Gaor arbitrary positive direction (odds) (4445-4265) core promoters

  1. Gaor7ci: GTCGTG at 4301.

Gaor arbitrary negative direction (evens) (2596-1) distal promoters

  1. Gaor8: CACGGC at 944.
  2. Gaor0ci: GCCGTG at 2383, GCCGTG at 1064.
  3. Gaor6ci: GCCGTG at 335.

Gaor alternate negative direction (odds) (2596-1) distal promoters

  1. Gaor1: CACGAC at 1761, CACGGC at 989.
  2. Gaor3: CACGGC at 1954.
  3. Gaor5: CACGGC at 567.
  4. Gaor7: CACGAC at 1941, CACGAC at 1521.
  5. Gaor9: CACGAC at 2147, CACGGC at 770, CACGAC at 382, CACGGC at 216.
  6. Gaor7ci: GCCGTG at 2231, GTCGTG at 1400.

Gaor arbitrary positive direction (odds) (4050-1) distal promoters

  1. Gaor1: CACGAC at 3245, CACGAC at 1761, CACGGC at 989.
  2. Gaor3: CACGGC at 1954.
  3. Gaor5: CACGGC at 3952, CACGGC at 3743, CACGGC at 567.
  4. Gaor7: CACGAC at 1941, CACGAC at 1521.
  5. Gaor9: CACGAC at 2147, CACGGC at 770, CACGAC at 382, CACGGC at 216.
  6. Gaor1ci: GCCGTG at 3500.
  7. Gaor5ci: GCCGTG at 2968.
  8. Gaor7ci: GCCGTG at 4041, GCCGTG at 3131, GCCGTG at 2231, GTCGTG at 1400.
  9. Gaor9ci: GTCGTG at 2702.

Gaor alternate positive direction (evens) (4050-1) distal promoters

  1. Gaor0: CACGAC at 3647, CACGGC at 3197.
  2. Gaor8: CACGGC at 4040, CACGGC at 944.
  3. Gaor0ci: GCCGTG at 2383, GCCGTG at 1064.
  4. Gaor6ci: GCCGTG at 335.
  5. Gaor8ci: GTCGTG at 3845.

N-box (Gao) analysis and results

"Group E comprises two families in which the proteins have a conserved Pro or Gly residue within the basic region that mediates preferential binding to the N-box sequences CACGGC or CACGAC."[3]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 3 | 2 | 1.5 | 1.5 ± 1.5 (--3,+-0)
Randoms | UTR | arbitrary negative | 4 | 10 | 0.4 | 0.6 ± 0.2
Randoms | UTR | alternate negative | 8 | 10 | 0.8 | 0.6 ± 0.2
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Core | alternate negative | 0 | 10 | 0 | 0
Reals | Core | positive | 0 | 2 | 0 | 0
Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.05
Randoms | Core | alternate positive | 0 | 10 | 0 | 0.05
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0
Reals | Proximal | positive | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0
Randoms | Proximal | alternate positive | 0 | 10 | 0 | 0
Reals | Distal | negative | 6 | 2 | 3 | 3 ± 3 (--6,+-0)
Randoms | Distal | arbitrary negative | 4 | 10 | 0.4 | 0.8
Randoms | Distal | alternate negative | 12 | 10 | 1.2 | 0.8
Reals | Distal | positive | 16 | 2 | 8 | 8 ± 3 (-+5,++11)
Randoms | Distal | arbitrary positive | 19 | 10 | 1.9 | 1.35 ± 0.55
Randoms | Distal | alternate positive | 8 | 10 | 0.8 | 1.35 ± 0.55

Comparison:

The occurrences of real N-box (Gao) sequences are greater than those in the random datasets. This suggests that the real N-box (Gao) sequences are likely active or activable.

N-box (Leal) samplings

Copying the response element consensus sequence CACNAG into the find function ("⌘F") finds no occurrences between ZNF497 and A1BG and none between ZSCAN22 and A1BG, as can be found by the computer programs.

The Basic programs testing the consensus sequence CACNAG (starting with SuccessablesLeal.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, the sequences they look for, and the occurrences they found are:

  1. negative strand, negative direction, looking for CACNAG, 3, CACGAG at 4403, CACCAG at 3812, CACTAG at 1481.
  2. positive strand, negative direction, looking for CACNAG, 10, CACGAG at 4472, CACAAG at 3634, CACTAG at 3493, CACTAG at 3369, CACGAG at 3232, CACAAG at 2244, CACGAG at 1182, CACGAG at 708, CACGAG at 572, CACGAG at 435.
  3. positive strand, positive direction, looking for CACNAG, 7, CACCAG at 4379, CACGAG at 3152, CACCAG at 2940, CACTAG at 2513, CACTAG at 2377, CACGAG at 2090, CACAAG at 107.
  4. negative strand, positive direction, looking for CACNAG, 11, CACCAG at 3719, CACCAG at 2995, CACTAG at 2638, CACCAG at 2604, CACCAG at 1630, CACCAG at 1462, CACCAG at 1362, CACCAG at 1126, CACCAG at 706, CACCAG at 622, CACGAG at 243.
  5. complement, negative strand, negative direction, looking for GTGNTC, 10, GTGCTC at 4472, GTGTTC at 3634, GTGATC at 3493, GTGATC at 3369, GTGCTC at 3232, GTGTTC at 2244, GTGCTC at 1182, GTGCTC at 708, GTGCTC at 572, GTGCTC at 435.
  6. complement, positive strand, negative direction, looking for GTGNTC, 3, GTGCTC at 4403, GTGGTC at 3812, GTGATC at 1481.
  7. complement, positive strand, positive direction, looking for GTGNTC, 11, GTGGTC at 3719, GTGGTC at 2995, GTGATC at 2638, GTGGTC at 2604, GTGGTC at 1630, GTGGTC at 1462, GTGGTC at 1362, GTGGTC at 1126, GTGGTC at 706, GTGGTC at 622, GTGCTC at 243.
  8. complement, negative strand, positive direction, looking for GTGNTC, 7, GTGGTC at 4379, GTGCTC at 3152, GTGGTC at 2940, GTGATC at 2513, GTGATC at 2377, GTGCTC at 2090, GTGTTC at 107.
  9. inverse complement, negative strand, negative direction, looking for CTNGTG, 18, CTAGTG at 4159, CTAGTG at 4008, CTCGTG at 3914, CTGGTG at 3763, CTTGTG at 3669, CTAGTG at 3490, CTAGTG at 3278, CTAGTG at 3099, CTAGTG at 2576, CTAGTG at 2415, CTAGTG at 2241, CTGGTG at 2123, CTAGTG at 1989, CTAGTG at 1169, CTAGTG at 879, CTAGTG at 705, CTAGTG at 527, CTAGTG at 432.
  10. inverse complement, positive strand, negative direction, looking for CTNGTG, 1, CTGGTG at 2328.
  11. inverse complement, positive strand, positive direction, looking for CTNGTG, 7, CTCGTG at 3739, CTGGTG at 3716, CTTGTG at 3095, CTCGTG at 1627, CTCGTG at 1207, CTCGTG at 955, CTCGTG at 855.
  12. inverse complement, negative strand, positive direction, looking for CTNGTG, 6, CTCGTG at 4376, CTGGTG at 2812, CTAGTG at 2169, CTGGTG at 1142, CTGGTG at 781, CTGGTG at 104.
  13. inverse, negative strand, negative direction, looking for GANCAC, 1, GACCAC at 2328.
  14. inverse, positive strand, negative direction, looking for GANCAC, 18, GATCAC at 4159, GATCAC at 4008, GAGCAC at 3914, GACCAC at 3763, GAACAC at 3669, GATCAC at 3490, GATCAC at 3278, GATCAC at 3099, GATCAC at 2576, GATCAC at 2415, GATCAC at 2241, GACCAC at 2123, GATCAC at 1989, GATCAC at 1169, GATCAC at 879, GATCAC at 705, GATCAC at 527, GATCAC at 432.
  15. inverse, positive strand, positive direction, looking for GANCAC, 6, GAGCAC at 4376, GACCAC at 2812, GATCAC at 2169, GACCAC at 1142, GACCAC at 781, GACCAC at 104.
  16. inverse, negative strand, positive direction, looking for GANCAC, 7, GAGCAC at 3739, GACCAC at 3716, GAACAC at 3095, GAGCAC at 1627, GAGCAC at 1207, GAGCAC at 955, GAGCAC at 855.
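
In CACNAG the N stands for any of the four bases, which is why the hits above include CACAAG, CACCAG, CACGAG and CACTAG. As a small illustration, and not a description of SuccessablesLeal.bas, the sketch below expands a consensus containing N into its concrete sequences. The names `IUPAC` and `expand` are hypothetical, and only the symbols A, C, G, T and N are handled.

```python
from itertools import product

# Minimal symbol table: only the codes that appear in CACNAG and its orientations.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}


def expand(consensus):
    """List every concrete sequence matched by a consensus containing N."""
    choices = (IUPAC[symbol] for symbol in consensus)
    return ["".join(bases) for bases in product(*choices)]


leal = expand("CACNAG")  # ['CACAAG', 'CACCAG', 'CACGAG', 'CACTAG']
includes_bai = "CACGAG" in leal  # True
```

Because CACGAG is one of the four expansions, every N-box (Bai) position is also found by the CACNAG search, for example CACGAG at 4403 on the negative strand in the negative direction.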

Leal (4560-2846) UTRs

  1. Negative strand, negative direction: CACGAG at 4403, CTAGTG at 4159, CTAGTG at 4008, CTCGTG at 3914, CACCAG at 3812, CTGGTG at 3763, CTTGTG at 3669, CTAGTG at 3490, CTAGTG at 3278, CTAGTG at 3099.
  2. Positive strand, negative direction: CACGAG at 4472, CACAAG at 3634, CACTAG at 3493, CACTAG at 3369, CACGAG at 3232.

Leal positive direction (4445-4265) core promoters

  1. Negative strand, positive direction: CTCGTG at 4376.
  2. Positive strand, positive direction: CACCAG at 4379.

Leal negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: CTAGTG at 2576, CTAGTG at 2415, CTAGTG at 2241, CTGGTG at 2123, CTAGTG at 1989, CACTAG at 1481, CTAGTG at 1169, CTAGTG at 879, CTAGTG at 705, CTAGTG at 527, CTAGTG at 432.
  2. Positive strand, negative direction: CTGGTG at 2328, CACAAG at 2244, CACGAG at 1182, CACGAG at 708, CACGAG at 572, CACGAG at 435.

Leal positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: CACCAG at 3719, CACCAG at 2995, CTGGTG at 2812, CACTAG at 2638, CACCAG at 2604, CTAGTG at 2169, CACCAG at 1630, CACCAG at 1462, CACCAG at 1362, CTGGTG at 1142, CACCAG at 1126, CTGGTG at 781, CACCAG at 706, CACCAG at 622, CACGAG at 243, CTGGTG at 104.
  2. Positive strand, positive direction: CTCGTG at 3739, CTGGTG at 3716, CACGAG at 3152, CTTGTG at 3095, CACCAG at 2940, CACTAG at 2513, CACTAG at 2377, CACGAG at 2090, CTCGTG at 1627, CTCGTG at 1207, CTCGTG at 955, CTCGTG at 855, CACAAG at 107.

N box (Leal) random dataset samplings

  1. Lealr0: 3, CACGAG at 2462, CACGAG at 1070, CACGAG at 443.
  2. Lealr1: 3, CACAAG at 4525, CACGAG at 3665, CACTAG at 3260.
  3. Lealr2: 4, CACAAG at 4090, CACAAG at 2028, CACCAG at 1541, CACGAG at 1005.
  4. Lealr3: 4, CACCAG at 3213, CACTAG at 3198, CACCAG at 2251, CACTAG at 565.
  5. Lealr4: 1, CACAAG at 4396.
  6. Lealr5: 4, CACCAG at 3501, CACAAG at 3418, CACTAG at 2641, CACTAG at 1710.
  7. Lealr6: 3, CACCAG at 3910, CACCAG at 3883, CACTAG at 2262.
  8. Lealr7: 5, CACCAG at 4479, CACTAG at 2038, CACCAG at 1872, CACCAG at 1602, CACGAG at 1241.
  9. Lealr8: 4, CACGAG at 2995, CACTAG at 2798, CACTAG at 2768, CACCAG at 2734.
  10. Lealr9: 2, CACCAG at 3188, CACTAG at 650.
  11. Lealr0ci: 1, CTTGTG at 4106.
  12. Lealr1ci: 9, CTGGTG at 3480, CTCGTG at 2548, CTAGTG at 2023, CTGGTG at 1742, CTAGTG at 1579, CTCGTG at 1377, CTAGTG at 968, CTTGTG at 871, CTTGTG at 44.
  13. Lealr2ci: 2, CTTGTG at 3505, CTGGTG at 1109.
  14. Lealr3ci: 5, CTAGTG at 3851, CTCGTG at 3114, CTAGTG at 2439, CTAGTG at 1571, CTGGTG at 422.
  15. Lealr4ci: 4, CTAGTG at 3420, CTGGTG at 3324, CTGGTG at 2758, CTGGTG at 1294.
  16. Lealr5ci: 5, CTTGTG at 2854, CTAGTG at 2576, CTAGTG at 2423, CTAGTG at 2204, CTAGTG at 1214.
  17. Lealr6ci: 2, CTTGTG at 1745, CTGGTG at 1572.
  18. Lealr7ci: 6, CTAGTG at 4286, CTCGTG at 3824, CTGGTG at 3396, CTGGTG at 3074, CTAGTG at 2655, CTCGTG at 1007.
  19. Lealr8ci: 4, CTCGTG at 3664, CTAGTG at 1564, CTCGTG at 1212, CTAGTG at 752.
  20. Lealr9ci: 3, CTAGTG at 4469, CTGGTG at 2788, CTGGTG at 2202.

Lealr arbitrary (evens) (4560-2846) UTRs

  1. Lealr2: CACAAG at 4090.
  2. Lealr4: CACAAG at 4396.
  3. Lealr0ci: CTTGTG at 4106.
  4. Lealr2ci: CTTGTG at 3505.
  5. Lealr4ci: CTAGTG at 3420, CTGGTG at 3324.
  6. Lealr8ci: CTCGTG at 3664.

Lealr alternate (odds) (4560-2846) UTRs

  1. Lealr1: CACAAG at 4525, CACGAG at 3665, CACTAG at 3260.
  2. Lealr3: CACCAG at 3213, CACTAG at 3198.
  3. Lealr5: CACCAG at 3501, CACAAG at 3418.
  4. Lealr7: CACCAG at 4479.
  5. Lealr9: CACCAG at 3188.
  6. Lealr1ci: CTGGTG at 3480.
  7. Lealr3ci: CTAGTG at 3851, CTCGTG at 3114.
  8. Lealr5ci: CTTGTG at 2854.
  9. Lealr7ci: CTAGTG at 4286, CTCGTG at 3824, CTGGTG at 3396, CTGGTG at 3074.
  10. Lealr9ci: CTAGTG at 4469.

Lealr arbitrary positive direction (odds) (4445-4265) core promoters

  1. Lealr7ci: CTAGTG at 4286.

Lealr alternate positive direction (evens) (4445-4265) core promoters

  1. Lealr4: CACAAG at 4396.

Lealr arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. Lealr4ci: CTGGTG at 2758.

Lealr alternate negative direction (odds) (2811-2596) proximal promoters

  1. Lealr5: CACTAG at 2641.
  2. Lealr7ci: CTAGTG at 2655.
  3. Lealr9ci: CTGGTG at 2788.

Lealr alternate positive direction (evens) (4265-4050) proximal promoters

  1. Lealr2: CACAAG at 4090.
  2. Lealr0ci: CTTGTG at 4106.

Lealr arbitrary negative direction (evens) (2596-1) distal promoters

  1. Lealr0: CACGAG at 2462, CACGAG at 1070, CACGAG at 443.
  2. Lealr2: CACAAG at 2028, CACCAG at 1541, CACGAG at 1005.
  3. Lealr2ci: CTGGTG at 1109.
  4. Lealr4ci: CTGGTG at 1294.
  5. Lealr6ci: CTTGTG at 1745, CTGGTG at 1572.
  6. Lealr8ci: CTAGTG at 1564, CTCGTG at 1212, CTAGTG at 752.

Lealr alternate negative direction (odds) (2596-1) distal promoters

  1. Lealr3: CACCAG at 2251, CACTAG at 565.
  2. Lealr5: CACTAG at 1710.
  3. Lealr7: CACTAG at 2038, CACCAG at 1872, CACCAG at 1602, CACGAG at 1241.
  4. Lealr9: CACTAG at 650.
  5. Lealr1ci: CTCGTG at 2548, CTAGTG at 2023, CTGGTG at 1742, CTAGTG at 1579, CTCGTG at 1377, CTAGTG at 968, CTTGTG at 871, CTTGTG at 44.
  6. Lealr3ci: CTAGTG at 2439, CTAGTG at 1571, CTGGTG at 422.
  7. Lealr5ci: CTAGTG at 2576, CTAGTG at 2423, CTAGTG at 2204, CTAGTG at 1214.
  8. Lealr7ci: CTCGTG at 1007.
  9. Lealr9ci: CTGGTG at 2202.

Lealr arbitrary positive direction (odds) (4050-1) distal promoters

  1. Lealr1: CACGAG at 3665, CACTAG at 3260.
  2. Lealr3: CACCAG at 3213, CACTAG at 3198, CACCAG at 2251, CACTAG at 565.
  3. Lealr5: CACCAG at 3501, CACAAG at 3418, CACTAG at 2641, CACTAG at 1710.
  4. Lealr7: CACTAG at 2038, CACCAG at 1872, CACCAG at 1602, CACGAG at 1241.
  5. Lealr9: CACCAG at 3188, CACTAG at 650.
  6. Lealr1ci: CTGGTG at 3480, CTCGTG at 2548, CTAGTG at 2023, CTGGTG at 1742, CTAGTG at 1579, CTCGTG at 1377, CTAGTG at 968, CTTGTG at 871, CTTGTG at 44.
  7. Lealr3ci: CTAGTG at 3851, CTCGTG at 3114, CTAGTG at 2439, CTAGTG at 1571, CTGGTG at 422.
  8. Lealr5ci: CTTGTG at 2854, CTAGTG at 2576, CTAGTG at 2423, CTAGTG at 2204, CTAGTG at 1214.
  9. Lealr7ci: CTCGTG at 3824, CTGGTG at 3396, CTGGTG at 3074, CTAGTG at 2655, CTCGTG at 1007.
  10. Lealr9ci: CTGGTG at 2788, CTGGTG at 2202.

Lealr alternate positive direction (evens) (4050-1) distal promoters

  1. Lealr0: CACGAG at 2462, CACGAG at 1070, CACGAG at 443.
  2. Lealr2: CACAAG at 2028, CACCAG at 1541, CACGAG at 1005.
  3. Lealr2ci: CTTGTG at 3505, CTGGTG at 1109.
  4. Lealr4ci: CTAGTG at 3420, CTGGTG at 3324, CTGGTG at 2758, CTGGTG at 1294.
  5. Lealr6ci: CTTGTG at 1745, CTGGTG at 1572.
  6. Lealr8ci: CTCGTG at 3664, CTAGTG at 1564, CTCGTG at 1212, CTAGTG at 752.
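The hexamers listed in the samplings above are the concrete sequences matched by the N-box consensus CACNAG and, for the "ci" datasets, by CTNGTG. The following is a minimal Python sketch of that expansion, assuming that N stands for any of the four bases and that "ci" denotes the complement inverse of the consensus; the function names are hypothetical and are not taken from the Basic programs.

```python
from itertools import product

# IUPAC code N stands for any of the four bases (assumption stated above).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def expand(consensus):
    """Return every concrete sequence matched by a consensus containing N."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in consensus))]


def inverse_complement(consensus):
    """Reverse the consensus and complement each base."""
    return "".join(COMPLEMENT[b] for b in reversed(consensus))


n_box = "CACNAG"
print(expand(n_box))                      # ['CACAAG', 'CACCAG', 'CACGAG', 'CACTAG']
print(inverse_complement(n_box))          # CTNGTG
print(expand(inverse_complement(n_box)))  # ['CTAGTG', 'CTCGTG', 'CTGGTG', 'CTTGTG']
```

Every hexamer reported in the random samplings above is one of these eight sequences.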

N box (Leal) analysis and results

The N-box consensus sequence CACNAG is bound by the bHLH protein HES-1 with a strong DNA binding activity.[5]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 15 | 2 | 7.5 | 7.5 ± 2.5 (--10,+-5)
Randoms | UTR | arbitrary negative | 7 | 10 | 0.7 | 1.25 ± 0.55
Randoms | UTR | alternate negative | 18 | 10 | 1.8 | 1.25 ± 0.55
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Core | alternate negative | 0 | 10 | 0 | 0
Reals | Core | positive | 2 | 2 | 1 | 1 ± 0 (-+1,++1)
Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.1
Randoms | Core | alternate positive | 1 | 10 | 0.1 | 0.1
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary negative | 1 | 10 | 0.1 | 0.2
Randoms | Proximal | alternate negative | 3 | 10 | 0.3 | 0.2
Reals | Proximal | positive | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0.1
Randoms | Proximal | alternate positive | 2 | 10 | 0.2 | 0.1
Reals | Distal | negative | 17 | 2 | 8.5 | 8.5 ± 2.5 (--11,+-6)
Randoms | Distal | arbitrary negative | 13 | 10 | 1.3 | 1.9 ± 0.6
Randoms | Distal | alternate negative | 25 | 10 | 2.5 | 1.9 ± 0.6
Reals | Distal | positive | 29 | 2 | 14.5 | 14.5 ± 1.5 (-+16,++13)
Randoms | Distal | arbitrary positive | 42 | 10 | 4.2 | 3.0 ± 1.2
Randoms | Distal | alternate positive | 18 | 10 | 1.8 | 3.0 ± 1.2

Comparison:

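The occurrences of real N-boxes (Leal) are greater than the randoms in the UTRs, the positive-direction core promoters, and the distal promoters; no real N-boxes (Leal) occur in the proximal promoters. This suggests that the real N-boxes (Leal) are likely active or activable.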
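The arithmetic behind the table can be sketched in a few lines of Python. This is an illustration that is consistent with the tabulated values, not the author's own procedure: it assumes that Occurrences is Numbers divided by Strands, and that each random average is the midpoint of the arbitrary and alternate occurrences with half their difference as the ± value. The function names are hypothetical.

```python
def occurrences(number, strands):
    """Occurrences per strand (reals) or per dataset (randoms)."""
    return number / strands


def random_average(arbitrary, alternate):
    """Midpoint of the two random occurrences and half of their difference."""
    mean = (arbitrary + alternate) / 2
    half_range = abs(arbitrary - alternate) / 2
    return mean, half_range


# UTR rows of the N box (Leal) table.
real_utr = occurrences(15, 2)
arbitrary_utr = occurrences(7, 10)
alternate_utr = occurrences(18, 10)
mean, half_range = random_average(arbitrary_utr, alternate_utr)

print(real_utr)              # 7.5
print(round(mean, 2))        # 1.25
print(round(half_range, 2))  # 0.55
```

With these assumptions the real UTR occurrence (7.5) is six times the random average (1.25), which is the kind of gap the comparison refers to.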

"Class C" (Leal) samplings

Copying the response element consensus sequence CACGNG and pasting it into "⌘F" finds no occurrences between ZNF497 and A1BG or between ZSCAN22 and A1BG, although occurrences can be found by the computer programs.
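One likely reason for the difference, stated here as an assumption rather than something the source confirms, is that a text search treats the N in CACGNG as a literal letter, whereas a program can treat it as a wildcard for A, C, G or T. A minimal Python sketch on a made-up toy sequence (not the A1BG sequence), with hypothetical function names and 1-based positions counted from the start of the string:

```python
import re


def find_literal(sequence, text):
    """Positions (1-based) where the text occurs exactly, as a text search would."""
    positions = []
    start = sequence.find(text)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(text, start + 1)
    return positions


def find_consensus(sequence, consensus):
    """Positions (1-based) and matched sequences, treating N as any base."""
    # The lookahead lets overlapping matches be reported as well.
    pattern = "(?=(" + consensus.replace("N", "[ACGT]") + "))"
    return [(m.start() + 1, m.group(1)) for m in re.finditer(pattern, sequence)]


toy = "TTCACGAGTTCACGTGAA"  # made-up sequence for illustration only
print(find_literal(toy, "CACGNG"))    # []
print(find_consensus(toy, "CACGNG"))  # [(3, 'CACGAG'), (11, 'CACGTG')]
```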

The Basic programs testing the consensus sequence CACGNG (starting with SuccessablesClassC.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, the sequences they look for, and what they found are:

  1. negative strand, negative direction, looking for CACGNG, 2, CACGAG at 4403, CACGGG at 3882.
  2. positive strand, negative direction, looking for CACGNG, 6, CACGAG at 4472, CACGAG at 3232, CACGAG at 1182, CACGAG at 708, CACGAG at 572, CACGAG at 435.
  3. positive strand, positive direction, looking for CACGNG, 14, CACGGG at 4275, CACGTG at 3884, CACGGG at 3236, CACGAG at 3152, CACGGG at 3012, CACGTG at 2961, CACGAG at 2090, CACGCG at 1726, CACGCG at 1522, CACGCG at 1245, CACGTG at 1219, CACGCG at 1161, CACGGG at 573, CACGTG at 547.
  4. negative strand, positive direction, looking for CACGNG, 9, CACGGG at 4260, CACGGG at 3749, CACGGG at 1642, CACGGG at 1560, CACGGG at 1224, CACGCG at 970, CACGCG at 870, CACGTG at 570, CACGAG at 243.
  5. complement, negative strand, negative direction, looking for GTGCNC, 6, GTGCTC at 4472, GTGCTC at 3232, GTGCTC at 1182, GTGCTC at 708, GTGCTC at 572, GTGCTC at 435.
  6. complement, positive strand, negative direction, looking for GTGCNC, 2, GTGCTC at 4403, GTGCCC at 3882.
  7. complement, positive strand, positive direction, looking for GTGCNC, 9, GTGCCC at 4260, GTGCCC at 3749, GTGCCC at 1642, GTGCCC at 1560, GTGCCC at 1224, GTGCGC at 970, GTGCGC at 870, GTGCAC at 570, GTGCTC at 243.
  8. complement, negative strand, positive direction, looking for GTGCNC, 14, GTGCCC at 4275, GTGCAC at 3884, GTGCCC at 3236, GTGCTC at 3152, GTGCCC at 3012, GTGCAC at 2961, GTGCTC at 2090, GTGCGC at 1726, GTGCGC at 1522, GTGCGC at 1245, GTGCAC at 1219, GTGCGC at 1161, GTGCCC at 573, GTGCAC at 547.
  9. inverse complement, negative strand, negative direction, looking for CNCGTG, 3, CTCGTG at 3914, CCCGTG at 1115, CCCGTG at 517.
  10. inverse complement, positive strand, negative direction, looking for CNCGTG, 1, CCCGTG at 1219.
  11. inverse complement, positive strand, positive direction, looking for CNCGTG, 16, CACGTG at 3884, CTCGTG at 3739, CACGTG at 2961, CTCGTG at 1627, CGCGTG at 1551, CGCGTG at 1299, CACGTG at 1219, CTCGTG at 1207, CGCGTG at 1131, CGCGTG at 1047, CGCGTG at 977, CTCGTG at 955, CGCGTG at 877, CTCGTG at 855, CGCGTG at 685, CACGTG at 547.
  12. inverse complement, negative strand, positive direction, looking for CNCGTG, 4, CTCGTG at 4376, CGCGTG at 1216, CACGTG at 570, CGCGTG at 544.
  13. inverse negative strand, negative direction, looking for GNGCAC, 1, GGGCAC at 1219.
  14. inverse positive strand, negative direction, looking for GNGCAC, 3, GAGCAC at 3914, GGGCAC at 1115, GGGCAC at 517.
  15. inverse positive strand, positive direction, looking for GNGCAC, 4, GAGCAC at 4376, GCGCAC at 1216, GTGCAC at 570, GCGCAC at 544.
  16. inverse negative strand, positive direction, looking for GNGCAC, 16, GTGCAC at 3884, GAGCAC at 3739, GTGCAC at 2961, GAGCAC at 1627, GCGCAC at 1551, GCGCAC at 1299, GTGCAC at 1219, GAGCAC at 1207, GCGCAC at 1131, GCGCAC at 1047, GCGCAC at 977, GAGCAC at 955, GCGCAC at 877, GAGCAC at 855, GCGCAC at 685, GTGCAC at 547.
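The four query forms used in the list above (direct, complement, inverse complement, inverse) can be derived mechanically from the consensus sequence. The following is a minimal Python sketch, not a transcription of the Basic programs; the function names are hypothetical.

```python
# Translation table for base complementation; N stays N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def complement(seq):
    """Complement each base without changing the order."""
    return seq.translate(COMPLEMENT)


def inverse(seq):
    """Reverse the order of the bases without complementing them."""
    return seq[::-1]


def inverse_complement(seq):
    """Complement each base and reverse the order."""
    return complement(seq)[::-1]


consensus = "CACGNG"
queries = {
    "direct": consensus,
    "complement": complement(consensus),
    "inverse complement": inverse_complement(consensus),
    "inverse": inverse(consensus),
}
for name, query in queries.items():
    print(name, query)
# direct CACGNG
# complement GTGCNC
# inverse complement CNCGTG
# inverse GNGCAC
```

Because the complement of one strand is the opposite strand, a complement search on one strand reports the same positions as a direct search on the other, which is the pattern seen between items 2 and 5 and between items 3 and 8 of the list.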

ClassC (4560-2846) UTRs

  1. Negative strand, negative direction: CACGAG at 4403, CTCGTG at 3914, CACGGG at 3882.
  2. Positive strand, negative direction: CACGAG at 4472, CACGAG at 3232.

ClassC positive direction (4445-4265) core promoters

  1. Negative strand, positive direction: CTCGTG at 4376.
  2. Positive strand, positive direction: CACGGG at 4275.

ClassC positive direction (4265-4050) proximal promoters

  1. Negative strand, positive direction: CACGGG at 4260.

ClassC negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: CCCGTG at 1115, CCCGTG at 517.
  2. Positive strand, negative direction: CCCGTG at 1219, CACGAG at 1182, CACGAG at 708, CACGAG at 572, CACGAG at 435.

ClassC positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: CACGGG at 3749, CACGGG at 1642, CACGGG at 1560, CACGGG at 1224, CGCGTG at 1216, CACGCG at 970, CACGCG at 870, CACGTG at 570, CGCGTG at 544, CACGAG at 243.
  2. Positive strand, positive direction: CACGTG at 3884, CTCGTG at 3739, CACGGG at 3236, CACGAG at 3152, CACGGG at 3012, CACGTG at 2961, CACGAG at 2090, CACGCG at 1726, CTCGTG at 1627, CGCGTG at 1551, CACGCG at 1522, CGCGTG at 1299, CACGCG at 1245, CACGTG at 1219, CTCGTG at 1207, CACGCG at 1161, CGCGTG at 1131, CGCGTG at 1047, CGCGTG at 977, CTCGTG at 955, CGCGTG at 877, CTCGTG at 855, CGCGTG at 685, CACGGG at 573, CACGTG at 547.
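The assignment of hits to UTRs and promoter regions above can be sketched as a range lookup. This Python sketch uses only the nucleotide ranges stated in the headings of this section; the negative-direction core range is not stated here and is therefore left out. The names REGIONS, regions_for and bin_hits are hypothetical, and treating the ranges as inclusive at both ends is an assumption.

```python
# Inclusive (low, high) ranges taken from the headings in this section.
REGIONS = {
    "negative": {"UTR": (2846, 4560), "proximal": (2596, 2811), "distal": (1, 2596)},
    "positive": {"core": (4265, 4445), "proximal": (4050, 4265), "distal": (1, 4050)},
}


def regions_for(position, direction):
    """Names of all regions whose range contains the position."""
    return [name for name, (low, high) in REGIONS[direction].items()
            if low <= position <= high]


def bin_hits(hits, direction):
    """Group (motif, position) hits by region for one direction."""
    binned = {name: [] for name in REGIONS[direction]}
    for motif, position in hits:
        for name in regions_for(position, direction):
            binned[name].append((motif, position))
    return binned


# Negative strand, negative direction hits reported in the samplings list.
hits = [("CACGAG", 4403), ("CTCGTG", 3914), ("CACGGG", 3882),
        ("CCCGTG", 1115), ("CCCGTG", 517)]
binned = bin_hits(hits, "negative")
print(binned["UTR"])     # [('CACGAG', 4403), ('CTCGTG', 3914), ('CACGGG', 3882)]
print(binned["distal"])  # [('CCCGTG', 1115), ('CCCGTG', 517)]
```

A position that falls exactly on a shared boundary, such as 2596 or 4265, is returned for both neighbouring regions; how the original programs resolve such a case is not stated.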

"Class C" random dataset samplings

  1. ClassCr0: 6, CACGTG at 4343, CACGGG at 4112, CACGGG at 3703, CACGAG at 2462, CACGAG at 1070, CACGAG at 443.
  2. ClassCr1: 3, CACGGG at 3913, CACGAG at 3665, CACGGG at 3233.
  3. ClassCr2: 2, CACGCG at 1930, CACGAG at 1005.
  4. ClassCr3: 3, CACGTG at 3769, CACGCG at 587, CACGGG at 385.
  5. ClassCr4: 3, CACGGG at 3160, CACGTG at 2287, CACGGG at 1285.
  6. ClassCr5: 4, CACGGG at 2928, CACGCG at 2700, CACGGG at 803, CACGTG at 59.
  7. ClassCr6: 4, CACGGG at 4460, CACGTG at 2905, CACGGG at 2210, CACGTG at 654.
  8. ClassCr7: 3, CACGGG at 3252, CACGTG at 1856, CACGAG at 1241.
  9. ClassCr8: 2, CACGAG at 2995, CACGCG at 15.
  10. ClassCr9: 2, CACGCG at 2117, CACGTG at 1187.
  11. ClassCr0ci: 1, CACGTG at 4343.
  12. ClassCr1ci: 3, CCCGTG at 4418, CTCGTG at 2548, CTCGTG at 1377.
  13. ClassCr2ci: 3, CCCGTG at 3634, CGCGTG at 3087, CGCGTG at 591.
  14. ClassCr3ci: 3, CACGTG at 3769, CTCGTG at 3114, CCCGTG at 2315.
  15. ClassCr4ci: 2, CACGTG at 2287, CCCGTG at 815.
  16. ClassCr5ci: 6, CCCGTG at 4496, CGCGTG at 4314, CCCGTG at 874, CCCGTG at 730, CCCGTG at 124, CACGTG at 59.
  17. ClassCr6ci: 3, CCCGTG at 4145, CACGTG at 2905, CACGTG at 654.
  18. ClassCr7ci: 5, CTCGTG at 3824, CCCGTG at 3665, CGCGTG at 3561, CACGTG at 1856, CTCGTG at 1007.
  19. ClassCr8ci: 5, CCCGTG at 4086, CCCGTG at 4053, CTCGTG at 3664, CTCGTG at 1212, CCCGTG at 183.
  20. ClassCr9ci: 5, CCCGTG at 3273, CCCGTG at 2942, CCCGTG at 2454, CACGTG at 1187, CCCGTG at 1120.

ClassCr arbitrary (evens) (4560-2846) UTRs

  1. ClassCr0: CACGTG at 4343, CACGGG at 4112, CACGGG at 3703.
  2. ClassCr4: CACGGG at 3160.
  3. ClassCr6: CACGGG at 4460, CACGTG at 2905.
  4. ClassCr8: CACGAG at 2995.
  5. ClassCr0ci: CACGTG at 4343.
  6. ClassCr2ci: CCCGTG at 3634, CGCGTG at 3087.
  7. ClassCr6ci: CCCGTG at 4145, CACGTG at 2905.
  8. ClassCr8ci: CCCGTG at 4086, CCCGTG at 4053, CTCGTG at 3664.

ClassCr alternate (odds) (4560-2846) UTRs

  1. ClassCr1: CACGGG at 3913, CACGAG at 3665, CACGGG at 3233.
  2. ClassCr3: CACGTG at 3769.
  3. ClassCr5: CACGGG at 2928.
  4. ClassCr7: CACGGG at 3252.
  5. ClassCr1ci: CCCGTG at 4418.
  6. ClassCr3ci: CACGTG at 3769, CTCGTG at 3114.
  7. ClassCr5ci: CCCGTG at 4496, CGCGTG at 4314.
  8. ClassCr7ci: CTCGTG at 3824, CCCGTG at 3665, CGCGTG at 3561.
  9. ClassCr9ci: CCCGTG at 3273, CCCGTG at 2942.
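The split of the random datasets into evens and odds for the UTR range can be reproduced from the positions in the "Class C" random dataset samplings. The following Python sketch is an illustration under stated assumptions: the dataset number is read from the digit in the dataset name, the UTR range 4560-2846 is treated as inclusive, and the names RANDOM_HITS, dataset_index and count_in_range are hypothetical.

```python
# Positions copied from the "Class C" random dataset samplings list.
RANDOM_HITS = {
    "ClassCr0": [4343, 4112, 3703, 2462, 1070, 443],
    "ClassCr1": [3913, 3665, 3233],
    "ClassCr2": [1930, 1005],
    "ClassCr3": [3769, 587, 385],
    "ClassCr4": [3160, 2287, 1285],
    "ClassCr5": [2928, 2700, 803, 59],
    "ClassCr6": [4460, 2905, 2210, 654],
    "ClassCr7": [3252, 1856, 1241],
    "ClassCr8": [2995, 15],
    "ClassCr9": [2117, 1187],
    "ClassCr0ci": [4343],
    "ClassCr1ci": [4418, 2548, 1377],
    "ClassCr2ci": [3634, 3087, 591],
    "ClassCr3ci": [3769, 3114, 2315],
    "ClassCr4ci": [2287, 815],
    "ClassCr5ci": [4496, 4314, 874, 730, 124, 59],
    "ClassCr6ci": [4145, 2905, 654],
    "ClassCr7ci": [3824, 3665, 3561, 1856, 1007],
    "ClassCr8ci": [4086, 4053, 3664, 1212, 183],
    "ClassCr9ci": [3273, 2942, 2454, 1187, 1120],
}


def dataset_index(name):
    """Dataset number embedded in a name such as 'ClassCr7ci'."""
    return int("".join(ch for ch in name if ch.isdigit()))


def count_in_range(hits, low, high, parity):
    """Count positions within [low, high] for datasets of the given parity."""
    return sum(
        1
        for name, positions in hits.items()
        if dataset_index(name) % 2 == parity
        for position in positions
        if low <= position <= high
    )


evens = count_in_range(RANDOM_HITS, 2846, 4560, parity=0)
odds = count_in_range(RANDOM_HITS, 2846, 4560, parity=1)
print(evens, odds)  # 15 16
```

The two counts agree with the number of entries in the evens and odds UTR lists above.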

ClassCr arbitrary positive direction (odds) (4445-4265) core promoters

  1. ClassCr1ci: CCCGTG at 4418.
  2. ClassCr5ci: CGCGTG at 4314.

ClassCr alternate positive direction (evens) (4445-4265) core promoters

  1. ClassCr0: CACGTG at 4343.
  2. ClassCr0ci: CACGTG at 4343.

ClassCr alternate negative direction (odds) (2811-2596) proximal promoters

  1. ClassCr5: CACGCG at 2700.

ClassCr alternate positive direction (evens) (4265-4050) proximal promoters

  1. ClassCr0: CACGGG at 4112.
  2. ClassCr6ci: CCCGTG at 4145.
  3. ClassCr8ci: CCCGTG at 4086, CCCGTG at 4053.

ClassCr arbitrary negative direction (evens) (2596-1) distal promoters

  1. ClassCr0: CACGAG at 2462, CACGAG at 1070, CACGAG at 443.
  2. ClassCr2: CACGCG at 1930, CACGAG at 1005.
  3. ClassCr4: CACGTG at 2287, CACGGG at 1285.
  4. ClassCr6: CACGGG at 2210, CACGTG at 654.
  5. ClassCr8: CACGCG at 15.
  6. ClassCr2ci: CGCGTG at 591.
  7. ClassCr4ci: CACGTG at 2287, CCCGTG at 815.
  8. ClassCr6ci: CACGTG at 654.
  9. ClassCr8ci: CTCGTG at 1212, CCCGTG at 183.

ClassCr alternate negative direction (odds) (2596-1) distal promoters

  1. ClassCr3: CACGCG at 587, CACGGG at 385.
  2. ClassCr5: CACGGG at 803, CACGTG at 59.
  3. ClassCr7: CACGTG at 1856, CACGAG at 1241.
  4. ClassCr9: CACGCG at 2117, CACGTG at 1187.
  5. ClassCr1ci: CTCGTG at 2548, CTCGTG at 1377.
  6. ClassCr3ci: CCCGTG at 2315.
  7. ClassCr5ci: CCCGTG at 874, CCCGTG at 730, CCCGTG at 124, CACGTG at 59.
  8. ClassCr7ci: CACGTG at 1856, CTCGTG at 1007.
  9. ClassCr9ci: CCCGTG at 2454, CACGTG at 1187, CCCGTG at 1120.

ClassCr arbitrary positive direction (odds) (4050-1) distal promoters

  1. ClassCr1: CACGGG at 3913, CACGAG at 3665, CACGGG at 3233.
  2. ClassCr3: CACGTG at 3769, CACGCG at 587, CACGGG at 385.
  3. ClassCr5: CACGGG at 2928, CACGCG at 2700, CACGGG at 803, CACGTG at 59.
  4. ClassCr7: CACGGG at 3252, CACGTG at 1856, CACGAG at 1241.
  5. ClassCr9: CACGCG at 2117, CACGTG at 1187.
  6. ClassCr1ci: CTCGTG at 2548, CTCGTG at 1377.
  7. ClassCr3ci: CACGTG at 3769, CTCGTG at 3114, CCCGTG at 2315.
  8. ClassCr5ci: CCCGTG at 874, CCCGTG at 730, CCCGTG at 124, CACGTG at 59.
  9. ClassCr7ci: CTCGTG at 3824, CCCGTG at 3665, CGCGTG at 3561, CACGTG at 1856, CTCGTG at 1007.
  10. ClassCr9ci: CCCGTG at 3273, CCCGTG at 2942, CCCGTG at 2454, CACGTG at 1187, CCCGTG at 1120.

ClassCr alternate positive direction (evens) (4050-1) distal promoters

  1. ClassCr0: CACGGG at 3703, CACGAG at 2462, CACGAG at 1070, CACGAG at 443.
  2. ClassCr2: CACGCG at 1930, CACGAG at 1005.
  3. ClassCr4: CACGGG at 3160, CACGTG at 2287, CACGGG at 1285.
  4. ClassCr6: CACGTG at 2905, CACGGG at 2210, CACGTG at 654.
  5. ClassCr8: CACGAG at 2995, CACGCG at 15.
  6. ClassCr2ci: CCCGTG at 3634, CGCGTG at 3087, CGCGTG at 591.
  7. ClassCr4ci: CACGTG at 2287, CCCGTG at 815.
  8. ClassCr6ci: CACGTG at 2905, CACGTG at 654.
  9. ClassCr8ci: CTCGTG at 3664, CTCGTG at 1212, CCCGTG at 183.

"Class C" analysis and results

The "Class C" DNA binding site at position -379/-374 in a reverse (-) orientation with a consensus sequence of CACGNG of the bHLH Hey-1 protein had a strong DNA binding activity.[5]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 5 | 2 | 2.5 | 2.5 ± 0.5 (--3,+-2)
Randoms | UTR | arbitrary negative | 15 | 10 | 1.5 | 1.55
Randoms | UTR | alternate negative | 16 | 10 | 1.6 | 1.55
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Core | alternate negative | 0 | 10 | 0 | 0
Reals | Core | positive | 2 | 2 | 1 | 1 ± 0 (-+1,++1)
Randoms | Core | arbitrary positive | 2 | 10 | 0.2 | 0.2
Randoms | Core | alternate positive | 2 | 10 | 0.2 | 0.2
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0.05
Randoms | Proximal | alternate negative | 1 | 10 | 0.1 | 0.05
Reals | Proximal | positive | 1 | 2 | 0.5 | 0.5 ± 0.5 (-+1,++0)
Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0.2
Randoms | Proximal | alternate positive | 4 | 10 | 0.4 | 0.2
Reals | Distal | negative | 7 | 2 | 3.5 | 3.5 ± 1.5 (--2,+-5)
Randoms | Distal | arbitrary negative | 16 | 10 | 1.6 | 1.8
Randoms | Distal | alternate negative | 20 | 10 | 2.0 | 1.8
Reals | Distal | positive | 35 | 2 | 17.5 | 17.5 ± 7.5 (-+10,++25)
Randoms | Distal | arbitrary positive | 34 | 10 | 3.4 | 2.9 ± 0.5
Randoms | Distal | alternate positive | 24 | 10 | 2.4 | 2.9 ± 0.5

Comparison:

The negative-strand, negative-direction distal occurrence (--2) is the same as the highest random negative distal occurrence (2.0). Otherwise, the occurrences of real Class C sites are generally greater than the randoms. This suggests that the real Class C sites are likely active or activable.
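The row-by-row comparison can be made explicit with a short Python sketch that sets each real occurrence from the "Class C" table against the corresponding random average. It is only a tally of which values are larger and makes no claim of statistical significance; the names ROWS and classify are hypothetical.

```python
# (region, direction, real occurrences, random average) from the "Class C" table.
ROWS = [
    ("UTR", "negative", 2.5, 1.55),
    ("Core", "negative", 0, 0),
    ("Core", "positive", 1, 0.2),
    ("Proximal", "negative", 0, 0.05),
    ("Proximal", "positive", 0.5, 0.2),
    ("Distal", "negative", 3.5, 1.8),
    ("Distal", "positive", 17.5, 2.9),
]


def classify(real, random_average):
    """Say whether the real occurrence is above, below or equal to the random average."""
    if real > random_average:
        return "above"
    if real < random_average:
        return "below"
    return "equal"


summary = {(region, direction): classify(real, random_average)
           for region, direction, real, random_average in ROWS}
above = sum(1 for verdict in summary.values() if verdict == "above")

for key, verdict in summary.items():
    print(key, verdict)
print(above, "of", len(ROWS), "rows above the random average")  # 5 of 7
```

The reals exceed the random average in five of the seven rows, equal it for the negative-direction core promoters, and fall below it only for the negative-direction proximal promoters.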

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

References

  1. Kun Huang, Jin Li, Mikako Ito, Jun-Ichi Takeda, Bisei Ohkawara, Tomoo Ogi, Akio Masuda, and Kinji Ohno (8 September 2020). "Gene Expression Profile at the Motor Endplate of the Neuromuscular Junction of Fast-Twitch Muscle". Frontiers in Molecular Neuroscience. 13: 154–165. doi:10.3389/fnmol.2020.00154. PMID 33117128. Retrieved 19 March 2021.
  2. Ge Bai, Da-Hai Yang, Peijian Chao, Heng Yao, MingLiang Fei, Yihan Zhang, Xuejun Chen, Bingguang Xiao, Feng Li, Zhen-Yu Wang, Jun Yang, and He Xie (19 December 2020). "Genome-wide identification and expression analysis of NtbHLH gene family in tobacco (Nicotiana tabacum L.) and the role of NtbHLH86 in drought adaptation". Plant Diversity. 10: 4. doi:10.1016/j.pld.2020.10.004. Retrieved 19 March 2021.
  3. Min Gao, Yanxun Zhu, Jinhua Yang, Hongjing Zhang, Chenxia Cheng, Yucheng Zhang, Ran Wan, Zhangjun Fei, and Xiping Wang (18 March 2019). "Identification of the grape basic helix–loop–helix transcription factor family and characterization of expression patterns in response to different stresses". Plant Growth Regulation. 88: 19–39. Retrieved 20 March 2021.
  4. Takahito Fukusumi, Theresa W. Guo, Shuling Ren, Sunny Haft, Chao Liu, Akihiro Sakai, Mizuo Ando, Yuki Saito, Sayed Sadat, and Joseph A. Califano (February 2021). "Reciprocal activation of HEY1 and NOTCH4 under SOX2 control promotes EMT in head and neck squamous cell carcinoma". International Journal of Oncology. 58 (2): 226–237. doi:10.3892/ijo.2020.5156. PMID 33491747. Retrieved 21 March 2021.
  5. María C. Leal, Ezequiel I. Surace, María P. Holgado, Carina C. Ferrari, Rodolfo Tarelli, Fernando Pitossi, Thomas Wisniewski, Eduardo M. Castaño, and Laura Morelli (19 October 2011). "Notch signaling proteins HES-1 and Hey-1 bind to insulin degrading enzyme (IDE) proximal promoter and repress its transcription and activity: Implications for cellular Aβ metabolism". Biochim Biophys Acta. 1823 (2): 227–235. doi:10.1016/j.bbamcr.2011.09.014. PMID 22036964. Retrieved 21 March 2021.
