CAT box gene transcriptions

Associate Editor(s)-in-Chief: Henry A. Hoff

Human genes

The "four cystatin genes [GeneID: 1469 CST1, GeneID: 1470 CST2, GeneID: 1471 CST3, and GeneID: 1472 CST4] contain the ATA-box sequence (ATAAA) in their 5'-flanking regions; however, the CAT-box sequence (CAT), a binding site of the transcription factor, CTF, is found only in the 5'-flanking region of the S-type cystatin genes."[1]

Gene expressions

The "5' flanking region of the rat acetylcholine receptor (AChR) β subunit gene [with] regulatory elements that confer muscle specificity [includes] a minimal TATA-box-less promoter region containing an initiator motif. An 85-bp fragment [promotes] high muscle-specific expression of a chloramphenicol acetyltransferase (CAT) reporter construct upon transfection in primary muscle cells. This sequence can be functionally dissected in a basal muscle-specific promoter element carrying a M-CAT box that is flanked at the 5' end by an enhancer element with two binding sites for myogenic factors. Point mutations in the M-CAT box cause the loss of transcriptional activity of the basal promoter fragment. The enhancer activity depends on the presence of both E boxes that cooperate in a synergistic fashion. [The] control of muscle-specific and developmental expression of the rat AChR β subunit gene requires both regulatory elements, the M-CAT box and two adjacent E boxes, located in close proximity to each other."[2]

Interactions

The "minimal regulatory region of the 5' flanking sequence contains E box elements that are defined by the nucleotides CANNTG [26, 27]. E boxes are shown to provide binding sites for helix-loop-helix proteins of the MyoD1 family including MyoD1 [28], myogenin [29, 30], MRF4/herculin [31] and myf5 [32]."[2]
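
The E box definition CANNTG quoted above is a degenerate pattern in which N stands for any nucleotide. A minimal sketch of how such a pattern could be matched is given below, assuming an upper- or lower-case DNA string as input; the function name `find_e_boxes` and the 1-based start-position convention are illustrative assumptions, not part of the cited work.

```python
import re

# CANNTG: N is any of A, C, G, T. The lookahead lets overlapping
# matches be reported (an assumption about the desired behaviour).
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(sequence):
    """Return (1-based start position, matched hexamer) for each E box."""
    return [(m.start() + 1, m.group(1)) for m in E_BOX.finditer(sequence.upper())]


if __name__ == "__main__":
    # Toy sequence containing the putative E box CAGGTG.
    print(find_e_boxes("TTCAGGTGAA"))
```

The toy input contains the hexamer CAGGTG, which is the putative E box element quoted from the rat AChR β subunit gene.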

Enhancer activity

"Partial sequence of the 5' flanking region of the rat AChR β subunit gene [contains] putative E box element [CAGGTG], putative Sp1 element [GGGGCGGGT at -85 nts], putative Shue box element [CCCTGGCCTGG at -15 nts], M-CAT box element [GCGGCCTC at -8 nts]."[2]

"Within the first 140 bp of the 5' flanking region the position and sequence of three other putative regulatory elements, the Sp1 [43, 44], M-CAT [34] and Shue box [45], are conserved between mouse and rat".[2]

Consensus sequences

"The M-CAT consensus sequence [is] CATTCCT".[2]

Promoter occurrences

"A CAT-box-like element, GCCATT [34], adjacent to the GC-box, is conserved in the three promoters."[2]

Hypotheses

  1. A1BG has no CAT boxes in either promoter.
  2. A1BG is not transcribed by a CAT box.
  3. CAT box does not participate in the transcription of A1BG.

CAT box samplings

Copying the CAT box consensus sequence 5'-CATTCCT-3' and pasting it into a "⌘F" search finds one occurrence between ZNF497 and A1BG and none between ZSCAN22 and A1BG, in agreement with the computer programs.

The Basic programs (starting with SuccessablesCAT.bas) test the consensus sequence 5'-CATTCCT-3' by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the strand, the direction, the sequence sought, the number of matches, and their positions:

  1. negative strand, negative direction, looking for 5'-CATTCCT-3', 0.
  2. negative strand, positive direction, looking for 5'-CATTCCT-3', 1, 5'-CATTCCT-3' at 2209, and complement.
  3. positive strand, negative direction, looking for 5'-CATTCCT-3', 0.
  4. positive strand, positive direction, looking for 5'-CATTCCT-3', 1, 5'-CATTCCT-3' at 2458, and complement.
  5. complement, negative strand, negative direction, looking for 5'-GTAAGGA-3', 0.
  6. complement, negative strand, positive direction, looking for 5'-GTAAGGA-3', 1, 5'-GTAAGGA-3' at 2458.
  7. complement, positive strand, negative direction, looking for 5'-GTAAGGA-3', 0.
  8. complement, positive strand, positive direction, looking for 5'-GTAAGGA-3', 1, 5'-GTAAGGA-3' at 2209.
  9. inverse complement, negative strand, negative direction, looking for 5'-AGGAATG-3', 0.
  10. inverse complement, negative strand, positive direction, looking for 5'-AGGAATG-3', 0.
  11. inverse complement, positive strand, negative direction, looking for 5'-AGGAATG-3', 1, 5'-AGGAATG-3' at 4554.
  12. inverse complement, positive strand, positive direction, looking for 5'-AGGAATG-3', 0.
  13. inverse negative strand, negative direction, looking for 5'-TCCTTAC-3', 1, 5'-TCCTTAC-3' at 4554.
  14. inverse negative strand, positive direction, looking for 5'-TCCTTAC-3', 0.
  15. inverse positive strand, negative direction, looking for 5'-TCCTTAC-3', 0.
  16. inverse positive strand, positive direction, looking for 5'-TCCTTAC-3', 0.
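
The four sequences sought in the list above (the consensus, its complement, its inverse complement, and its inverse) can be derived mechanically. The following is a minimal sketch in Python rather than the Basic used by the original programs; the function names and the convention of reporting 1-based start positions are assumptions made for illustration.

```python
# Watson-Crick base pairing used to build the complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def variants(consensus):
    """Return the four search strings derived from one consensus sequence."""
    complement = consensus.translate(COMPLEMENT)
    return {
        "direct": consensus,
        "complement": complement,
        "inverse complement": complement[::-1],
        "inverse": consensus[::-1],
    }


def find_all(sequence, motif):
    """Return 1-based start positions of every match, overlaps included."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    for name, motif in variants("CATTCCT").items():
        print(name, motif)
```

For 5'-CATTCCT-3' this yields GTAAGGA, AGGAATG, and TCCTTAC, the same strings listed in entries 5 through 16.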

CAT (4560-2846) UTRs

  1. Positive strand, negative direction: AGGAATG at 4554.

CAT positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: CATTCCT at 2209.
  2. Positive strand, positive direction: CATTCCT at 2458.

CAT box random dataset samplings

  1. CATr0: 0.
  2. CATr1: 0.
  3. CATr2: 0.
  4. CATr3: 1, CATTCCT at 3089.
  5. CATr4: 1, CATTCCT at 1553.
  6. CATr5: 1, CATTCCT at 985.
  7. CATr6: 0.
  8. CATr7: 0.
  9. CATr8: 0.
  10. CATr9: 0.
  11. CATr0ci: 0.
  12. CATr1ci: 0.
  13. CATr2ci: 1, AGGAATG at 4356.
  14. CATr3ci: 0.
  15. CATr4ci: 1, AGGAATG at 2701.
  16. CATr5ci: 0.
  17. CATr6ci: 0.
  18. CATr7ci: 0.
  19. CATr8ci: 1, AGGAATG at 157.
  20. CATr9ci: 1, AGGAATG at 3677.

CATr arbitrary (evens) (4560-2846) UTRs

  1. CATr2ci: AGGAATG at 4356.

CATr alternate (odds) (4560-2846) UTRs

  1. CATr3: CATTCCT at 3089.
  2. CATr9ci: AGGAATG at 3677.

CATr alternate positive direction (evens) (4445-4265) core promoters

  1. CATr2ci: AGGAATG at 4356.

CATr arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. CATr4ci: AGGAATG at 2701.

CATr arbitrary negative direction (evens) (2596-1) distal promoters

  1. CATr4: CATTCCT at 1553.
  2. CATr8ci: AGGAATG at 157.

CATr alternate negative direction (odds) (2596-1) distal promoters

  1. CATr5: CATTCCT at 985.

CATr arbitrary positive direction (odds) (4050-1) distal promoters

  1. CATr3: CATTCCT at 3089.
  2. CATr5: CATTCCT at 985.
  3. CATr9ci: AGGAATG at 3677.

CATr alternate positive direction (evens) (4050-1) distal promoters

  1. CATr4: CATTCCT at 1553.
  2. CATr8ci: AGGAATG at 157.
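
The negative-direction groupings above sort each random hit by region and by whether its dataset number is even or odd. A minimal sketch of that bookkeeping follows. The region bounds are copied from the headings (the core promoter bounds 2846-2811 come from the negative-direction headings used elsewhere on this page), while the names `NEGATIVE_REGIONS`, `classify`, and `tally`, and the rule for shared boundary values, are assumptions for illustration only.

```python
# Negative-direction regions as (name, lower, upper) nucleotide positions.
# Assumption: a boundary value shared by two headings (2846, 2811, 2596)
# is assigned to the region whose upper bound it is.
NEGATIVE_REGIONS = [
    ("UTR", 2846, 4560),
    ("core", 2811, 2846),
    ("proximal", 2596, 2811),
    ("distal", 0, 2596),
]

# Random-dataset hits from the list above: (dataset number, position).
RANDOM_HITS = [(3, 3089), (4, 1553), (5, 985), (2, 4356),
               (4, 2701), (8, 157), (9, 3677)]


def classify(position):
    """Return the negative-direction region containing a position."""
    for name, lower, upper in NEGATIVE_REGIONS:
        if lower < position <= upper:
            return name
    return None


def tally(hits):
    """Count hits per (region, parity of the dataset number)."""
    counts = {}
    for dataset, position in hits:
        parity = "evens" if dataset % 2 == 0 else "odds"
        key = (classify(position), parity)
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    for key, count in sorted(tally(RANDOM_HITS).items()):
        print(key, count)
```

With the seven random hits listed above, this gives one even-numbered and two odd-numbered UTR hits, one even-numbered proximal hit, and two even-numbered and one odd-numbered distal hits, in line with the negative-direction groupings.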

CAT box analysis and results

"The M-CAT consensus sequence [is] CATTCCT".[2]

Reals or randoms  Promoters  Direction           Numbers  Strands  Occurrences  Averages (± 0.1)
Reals             UTR        negative            1        2        0.5          0.5
Randoms           UTR        arbitrary negative  1        10       0.1          0.15
Randoms           UTR        alternate negative  2        10       0.2          0.15
Reals             Core       negative            0        2        0            0
Randoms           Core       arbitrary negative  0        10       0            0
Randoms           Core       alternate negative  0        10       0            0
Reals             Core       positive            0        2        0            0
Randoms           Core       arbitrary positive  0        10       0            0
Randoms           Core       alternate positive  1        10       0.1          0.05
Reals             Proximal   negative            0        2        0            0
Randoms           Proximal   arbitrary negative  1        10       0.1          0.05
Randoms           Proximal   alternate negative  0        10       0            0.05
Reals             Proximal   positive            0        2        0            0
Randoms           Proximal   arbitrary positive  0        10       0            0
Randoms           Proximal   alternate positive  0        10       0            0
Reals             Distal     negative            0        2        0            0
Randoms           Distal     arbitrary negative  2        10       0.2          0.15
Randoms           Distal     alternate negative  1        10       0.1          0.15
Reals             Distal     positive            2        2        1            1
Randoms           Distal     arbitrary positive  3        10       0.3          0.25
Randoms           Distal     alternate positive  2        10       0.2          0.25

Comparison:

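Where real CAT boxes occur, in the UTR (0.5 per strand) and in the positive-direction distal promoter (1 per strand), their occurrences exceed the random averages (0.15 and 0.25, respectively); in every other region the real count is 0. This suggests that the real CATs are likely active or activable.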
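
The Occurrences and Averages columns in the table above appear to follow from simple arithmetic: occurrences are the number of hits divided by the number of strands or datasets, and the random average is the mean of the arbitrary and alternate occurrences. The sketch below works this through under that reading; the function names are hypothetical and the rounding to two decimals is an assumption.

```python
def occurrences(numbers, strands):
    """Hits per strand (reals) or per dataset (randoms)."""
    return numbers / strands


def random_average(arbitrary, alternate, datasets=10):
    """Mean of the arbitrary and alternate occurrences, rounded to 2 places."""
    total = occurrences(arbitrary, datasets) + occurrences(alternate, datasets)
    return round(total / 2, 2)


if __name__ == "__main__":
    # UTR row: 1 real hit on 2 strands; 1 arbitrary and 2 alternate random hits.
    print(occurrences(1, 2), random_average(1, 2))
    # Distal positive row: 2 real hits; 3 arbitrary and 2 alternate random hits.
    print(occurrences(2, 2), random_average(3, 2))
```

Under these assumptions the UTR row gives 0.5 for the reals against 0.15 for the randoms, and the positive-direction distal row gives 1 against 0.25.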

CAT-box-like element samplings

Copying the CAT-box-like element consensus sequence GCCATT and pasting it into a "⌘F" search finds no occurrences between ZNF497 and A1BG and two between ZSCAN22 and A1BG, in agreement with the computer programs.

The Basic programs (starting with SuccessablesCATble.bas) test the consensus sequence GCCATT by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the strand, the direction, the sequence sought, the number of matches, and their positions:

  1. negative strand, negative direction, looking for GCCATT, 0.
  2. positive strand, negative direction, looking for GCCATT, 2, GCCATT at 3686, GCCATT at 3284.
  3. positive strand, positive direction, looking for GCCATT, 0.
  4. negative strand, positive direction, looking for GCCATT, 0.
  5. inverse complement, negative strand, negative direction, looking for AATGGC, 0.
  6. inverse complement, positive strand, negative direction, looking for AATGGC, 2, AATGGC at 3005, AATGGC at 1949.
  7. inverse complement, positive strand, positive direction, looking for AATGGC, 0.
  8. inverse complement, negative strand, positive direction, looking for AATGGC, 0.

CATble (4560-2846) UTRs

  1. Positive strand, negative direction: GCCATT at 3686, GCCATT at 3284, AATGGC at 3005.

CATble negative direction (2596-1) distal promoters

  1. Positive strand, negative direction: AATGGC at 1949.

CAT-box-like element random dataset samplings

  1. CATbler0: 3, GCCATT at 4001, GCCATT at 2389, GCCATT at 1184.
  2. CATbler1: 2, GCCATT at 3407, GCCATT at 1908.
  3. CATbler2: 1, GCCATT at 1471.
  4. CATbler3: 2, GCCATT at 1459, GCCATT at 401.
  5. CATbler4: 3, GCCATT at 3574, GCCATT at 3025, GCCATT at 1550.
  6. CATbler5: 3, GCCATT at 2803, GCCATT at 1202, GCCATT at 292.
  7. CATbler6: 3, GCCATT at 3997, GCCATT at 3726, GCCATT at 2292.
  8. CATbler7: 2, GCCATT at 2444, GCCATT at 2270.
  9. CATbler8: 2, GCCATT at 4021, GCCATT at 2461.
  10. CATbler9: 0.
  11. CATbler0ci: 1, AATGGC at 4239.
  12. CATbler1ci: 1, AATGGC at 3940.
  13. CATbler2ci: 1, AATGGC at 429.
  14. CATbler3ci: 2, AATGGC at 3370, AATGGC at 2694.
  15. CATbler4ci: 3, AATGGC at 3265, AATGGC at 1365, AATGGC at 1076.
  16. CATbler5ci: 0.
  17. CATbler6ci: 3, AATGGC at 1958, AATGGC at 251, AATGGC at 74.
  18. CATbler7ci: 1, AATGGC at 3389.
  19. CATbler8ci: 4, AATGGC at 3107, AATGGC at 2423, AATGGC at 640, AATGGC at 159.
  20. CATbler9ci: 4, AATGGC at 4343, AATGGC at 3788, AATGGC at 1724, AATGGC at 333.

CATbler arbitrary (evens) (4560-2846) UTRs

  1. CATbler0: GCCATT at 4001.
  2. CATbler4: GCCATT at 3574, GCCATT at 3025.
  3. CATbler6: GCCATT at 3997, GCCATT at 3726.
  4. CATbler8: GCCATT at 4021.
  5. CATbler0ci: AATGGC at 4239.
  6. CATbler4ci: AATGGC at 3265.
  7. CATbler8ci: AATGGC at 3107.

CATbler alternate (odds) (4560-2846) UTRs

  1. CATbler1: GCCATT at 3407.
  2. CATbler1ci: AATGGC at 3940.
  3. CATbler3ci: AATGGC at 3370.
  4. CATbler7ci: AATGGC at 3389.
  5. CATbler9ci: AATGGC at 4343, AATGGC at 3788.

CATbler arbitrary positive direction (odds) (4445-4265) core promoters

  1. CATbler9ci: AATGGC at 4343.

CATbler alternate negative direction (odds) (2811-2596) proximal promoters

  1. CATbler5: GCCATT at 2803.
  2. CATbler3ci: AATGGC at 2694.

CATbler alternate positive direction (evens) (4265-4050) proximal promoters

  1. CATbler0ci: AATGGC at 4239.

CATbler arbitrary negative direction (evens) (2596-1) distal promoters

  1. CATbler0: GCCATT at 2389, GCCATT at 1184.
  2. CATbler2: GCCATT at 1471.
  3. CATbler4: GCCATT at 1550.
  4. CATbler6: GCCATT at 2292.
  5. CATbler8: GCCATT at 2461.
  6. CATbler2ci: AATGGC at 429.
  7. CATbler4ci: AATGGC at 1365, AATGGC at 1076.
  8. CATbler6ci: AATGGC at 1958, AATGGC at 251, AATGGC at 74.
  9. CATbler8ci: AATGGC at 2423, AATGGC at 640, AATGGC at 159.

CATbler alternate negative direction (odds) (2596-1) distal promoters

  1. CATbler1: GCCATT at 1908.
  2. CATbler3: GCCATT at 1459, GCCATT at 401.
  3. CATbler5: GCCATT at 1202, GCCATT at 292.
  4. CATbler7: GCCATT at 2444, GCCATT at 2270.
  5. CATbler9ci: AATGGC at 1724, AATGGC at 333.

CATbler arbitrary positive direction (odds) (4050-1) distal promoters

  1. CATbler1: GCCATT at 3407, GCCATT at 1908.
  2. CATbler3: GCCATT at 1459, GCCATT at 401.
  3. CATbler5: GCCATT at 2803, GCCATT at 1202, GCCATT at 292.
  4. CATbler7: GCCATT at 2444, GCCATT at 2270.
  5. CATbler1ci: AATGGC at 3940.
  6. CATbler3ci: AATGGC at 3370, AATGGC at 2694.
  7. CATbler7ci: AATGGC at 3389.
  8. CATbler9ci: AATGGC at 3788, AATGGC at 1724, AATGGC at 333.

CATbler alternate positive direction (evens) (4050-1) distal promoters

  1. CATbler0: GCCATT at 4001, GCCATT at 2389, GCCATT at 1184.
  2. CATbler2: GCCATT at 1471.
  3. CATbler4: GCCATT at 3574, GCCATT at 3025, GCCATT at 1550.
  4. CATbler6: GCCATT at 3997, GCCATT at 3726, GCCATT at 2292.
  5. CATbler8: GCCATT at 4021, GCCATT at 2461.
  6. CATbler2ci: AATGGC at 429.
  7. CATbler4ci: AATGGC at 3265, AATGGC at 1365, AATGGC at 1076.
  8. CATbler6ci: AATGGC at 1958, AATGGC at 251, AATGGC at 74.
  9. CATbler8ci: AATGGC at 3107, AATGGC at 2423, AATGGC at 640, AATGGC at 159.

CAT box like analysis and results

"A CAT-box-like element, GCCATT [34], adjacent to the GC-box, is conserved in the three promoters."[2]

Reals or randoms  Promoters  Direction           Numbers  Strands  Occurrences  Averages (± 0.1)
Reals             UTR        negative            3        2        1.5          1.5
Randoms           UTR        arbitrary negative  9        10       0.9          0.75
Randoms           UTR        alternate negative  6        10       0.6          0.75
Reals             Core       negative            0        2        0            0
Randoms           Core       arbitrary negative  0        10       0            0
Randoms           Core       alternate negative  0        10       0            0
Reals             Core       positive            0        2        0            0
Randoms           Core       arbitrary positive  1        10       0.1          0.05
Randoms           Core       alternate positive  0        10       0            0.05
Reals             Proximal   negative            0        2        0            0
Randoms           Proximal   arbitrary negative  0        10       0            0.1
Randoms           Proximal   alternate negative  2        10       0.2          0.1
Reals             Proximal   positive            0        2        0            0
Randoms           Proximal   arbitrary positive  0        10       0            0.05
Randoms           Proximal   alternate positive  1        10       0.1          0.05
Reals             Distal     negative            1        2        0.5          0.5
Randoms           Distal     arbitrary negative  15       10       1.5          1.2 ± 0.3
Randoms           Distal     alternate negative  9        10       0.9          1.2 ± 0.3
Reals             Distal     positive            0        2        0            0
Randoms           Distal     arbitrary positive  16       10       1.6          1.9 ± 0.3
Randoms           Distal     alternate positive  22       10       2.2          1.9 ± 0.3

Comparison:

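The occurrences of real CATbles are greater than the randoms in the UTR (1.5 versus 0.75) and less than the randoms in the distal promoters (0.5 versus 1.2 ± 0.3 in the negative direction, 0 versus 1.9 ± 0.3 in the positive direction). This suggests that the real CATbles are likely active or activable.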

M-CAT box samplings

The Basic programs (starting with SuccessablesMCAT.bas) test the consensus sequence GCGGCCTC by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the strand, the direction, the sequence sought, and the number of matches:

  1. negative strand, negative direction, looking for AAAAAAAA, 0.
  2. positive strand, negative direction, looking for AAAAAAAA, 0.
  3. negative strand, positive direction, looking for AAAAAAAA, 0.
  4. positive strand, positive direction, looking for AAAAAAAA, 0.
  5. inverse complement, negative strand, negative direction, looking for TTTTTTTT, 0.
  6. inverse complement, positive strand, negative direction, looking for TTTTTTTT, 0.
  7. inverse complement, negative strand, positive direction, looking for TTTTTTTT, 0.
  8. inverse complement, positive strand, positive direction, looking for TTTTTTTT, 0.

Shue box samplings

The Basic programs (starting with SuccessablesShue.bas) test the consensus sequence CCCTGGCCTGG by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the strand, the direction, the sequence sought, and the number of matches:

  1. negative strand, negative direction, looking for AAAAAAAA, 0.
  2. positive strand, negative direction, looking for AAAAAAAA, 0.
  3. negative strand, positive direction, looking for AAAAAAAA, 0.
  4. positive strand, positive direction, looking for AAAAAAAA, 0.
  5. inverse complement, negative strand, negative direction, looking for TTTTTTTT, 0.
  6. inverse complement, positive strand, negative direction, looking for TTTTTTTT, 0.
  7. inverse complement, negative strand, positive direction, looking for TTTTTTTT, 0.
  8. inverse complement, positive strand, positive direction, looking for TTTTTTTT, 0.

Sp1 element samplings

The Basic programs (starting with SuccessablesSp1B.bas) test the consensus sequence GGGGCGGGT by comparing it with the nucleotide sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the strand, the direction, the sequence sought, and the number of matches:

  1. negative strand, negative direction, looking for AAAAAAAA, 0.
  2. positive strand, negative direction, looking for AAAAAAAA, 0.
  3. negative strand, positive direction, looking for AAAAAAAA, 0.
  4. positive strand, positive direction, looking for AAAAAAAA, 0.
  5. inverse complement, negative strand, negative direction, looking for TTTTTTTT, 0.
  6. inverse complement, positive strand, negative direction, looking for TTTTTTTT, 0.
  7. inverse complement, negative strand, positive direction, looking for TTTTTTTT, 0.
  8. inverse complement, positive strand, positive direction, looking for TTTTTTTT, 0.

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

Initial content for this page in some instances came from Wikiversity.

References

  1. Eiichi Saitoh and Satoko Isemura (January 1, 1993). "Molecular Biology of Human Salivary Cysteine Proteinase Inhibitors" (PDF). Critical Reviews in Oral Biology and Medicine. 4 (3/4): 487–93. doi:10.1177/10454411930040033301. Retrieved 2013-06-28.
  2. Christof Berberich, Ingolf Dürr, Michael Koenen and Veit Witzemann (September 1993). "Two adjacent E box elements and a M‐CAT box are involved in the muscle‐specific regulation of the rat acetylcholine receptor β subunit gene". European Journal of Biochemistry. 216 (2): 395–404. doi:10.1111/j.1432-1033.1993.tb18157.x. Retrieved 2019-12-27.
