X box gene transcriptions

Editor-In-Chief: Henry A. Hoff

"The so-called X (or X1) box in the promoter of the human MHC [major histocompatibility complex] class II DRA gene is the binding site for a ubiquitous mammalian sequence-specific DNA-binding protein called RFX, NF-X, NF-Xc, or RFX1 (4,19,23,24,27)."[1]

"RFX is MDBP [methylated DNA binding protein,] the MDBP (RFX) recognition site region in the DRA promoter can be considered to extend from positions -100 to -112 [...] a possible binding site for MDBP which begins 88 bp after the first residue of the presumably full-length RFX1 (MDBP) cDNA (26). This site (RFX+88) is as follows: 5'-GTTGGCATGGCAAC-3'."[1]

X-box motifs

"Based on sequence variability in the X box region (position -188 to -152), three different sequence motifs can be distinguished (X-A, X-B and X-C). Identical bases are marked by a dash; the region of the X1 box is underlined; the region corresponding to the X2 box is given in italics."[2]

X2 box is AGGTCCA.[2]

X-A box is AAAAAAAA//TCTGCCCAGAGACAGATGAGGTCCA, where // marks a missing TG. It contains X1 = CCCAGAGACAGATGA, and the missing TG disrupts the palindrome TGTCNNNNNNNNGACA.[2]

X-B box is AAAAAAAATGTCTGCCTAGAGACAGATTAGGTCCA which contains X1 = CCTAGAGACAGATTA and the palindrome CCTANNNNNNNNNTAGG.[2]

X-C box is AAAAAAAATGTCTGCCTAGAGACAGATGAGGTCCA which contains X1 = TGCCTAGAGAC and the palindrome CCTANNNNNNNNNTAGG.[2]

Alternate X1 boxes contain TCTGCC or AGAGACAGAT. These are tested below as X1 box alternate 1 (X1abox), TCTGCC, and X1 box alternate 2 (X1bbox), AGAGACAGAT.

Consensus sequences

"In order to define a candidate gene set of direct DAF‐19 targets, we first searched the X‐box consensus GTTNCCATGGNAAC from Swoboda et al. (2000), GTYNCYATRGYAAC from Blacque et al. (2005), and GTHNYYATRRNAAC from Efimenko et al. (2005) in the P. pacificus Hybrid1 assembly."[3] Y = (C/T), R = (A/G), N = (A, C, G, T), H = (A/C/T).

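The three consensus sequences, GTTNCCATGGNAAC, GT(C/T)NC(C/T)AT(A/G)G(C/T)AAC, and GT(A/C/T)N(C/T)(C/T)AT(A/G)(A/G)NAAC, generalize to GT(A/C/T)N(C/T)(C/T)AT(A/G)(A/G)NAAC, since the third already allows every sequence allowed by the first two.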
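The generalization can be checked position by position. The following is a minimal Python sketch, not one of the article's Basic programs; the function names `is_covered` and `expand` are hypothetical and only the IUPAC codes quoted above are included.

```python
# IUPAC codes used by the three consensus sequences quoted above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "R": "AG", "H": "ACT", "N": "ACGT",
}

def is_covered(specific, general):
    """True if every base allowed by `specific` is also allowed by `general`."""
    if len(specific) != len(general):
        return False
    return all(set(IUPAC[s]) <= set(IUPAC[g]) for s, g in zip(specific, general))

def expand(consensus):
    """Rewrite IUPAC codes in the article's parenthesized notation (N is kept as N)."""
    parts = []
    for code in consensus:
        bases = IUPAC[code]
        if code == "N" or len(bases) == 1:
            parts.append(code)
        else:
            parts.append("(" + "/".join(sorted(bases)) + ")")
    return "".join(parts)

swoboda = "GTTNCCATGGNAAC"
blacque = "GTYNCYATRGYAAC"
efimenko = "GTHNYYATRRNAAC"

for name, consensus in [("Swoboda", swoboda), ("Blacque", blacque)]:
    print(name, "covered by Efimenko:", is_covered(consensus, efimenko))
print(expand(efimenko))
```

Under these assumptions both earlier consensus sequences are covered by GTHNYYATRRNAAC, which is why it serves as the generalized pattern.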

Xbox (Zhang) samplings

The Basic programs (starting with SuccessablesXbox.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+), including extending the number of nucleotides (nts) from 958 to 4445. The programs, the sequences they look for, and the number of occurrences found are:

  1. negative strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesXbox--.bas, looking for 3'-GTTGGCATGGCAAC-5', 0,
  2. negative strand in the positive direction (from ZNF497 to A1BG) is SuccessablesXbox-+.bas, looking for 3'-GTTGGCATGGCAAC-5', 0,
  3. positive strand in the negative direction is SuccessablesXbox+-.bas, looking for 3'-GTTGGCATGGCAAC-5', 0,
  4. positive strand in the positive direction is SuccessablesXbox++.bas, looking for 3'-GTTGGCATGGCAAC-5', 0,
  5. complement, negative strand, negative direction is SuccessablesXboxc--.bas, looking for 3'-CAACCGTACCGTTG-5', 0,
  6. complement, negative strand, positive direction is SuccessablesXboxc-+.bas, looking for 3'-CAACCGTACCGTTG-5', 0,
  7. complement, positive strand, negative direction is SuccessablesXboxc+-.bas, looking for 3'-CAACCGTACCGTTG-5', 0,
  8. complement, positive strand, positive direction is SuccessablesXboxc++.bas, looking for 3'-CAACCGTACCGTTG-5', 0,
  9. inverse complement, negative strand, negative direction is SuccessablesXboxci--.bas, looking for 3'-GTTGCCATGCCAAC-5', 0,
  10. inverse complement, negative strand, positive direction is SuccessablesXboxci-+.bas, looking for 3'-GTTGCCATGCCAAC-5', 0,
  11. inverse complement, positive strand, negative direction is SuccessablesXboxci+-.bas, looking for 3'-GTTGCCATGCCAAC-5', 0,
  12. inverse complement, positive strand, positive direction is SuccessablesXboxci++.bas, looking for 3'-GTTGCCATGCCAAC-5', 0,
  13. inverse, negative strand, negative direction, is SuccessablesXboxi--.bas, looking for 3'-CAACGGTACGGTTG-5', 0,
  14. inverse, negative strand, positive direction, is SuccessablesXboxi-+.bas, looking for 3'-CAACGGTACGGTTG-5', 0,
  15. inverse, positive strand, negative direction, is SuccessablesXboxi+-.bas, looking for 3'-CAACGGTACGGTTG-5', 0,
  16. inverse, positive strand, positive direction, is SuccessablesXboxi++.bas, looking for 3'-CAACGGTACGGTTG-5', 0.
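The four search sequences in the list above are the original site and its complement, inverse, and inverse complement. A minimal Python sketch of how they are related follows; it is illustrative only and is not the code of the SuccessablesXbox programs, and the function names are hypothetical.

```python
# Translation table for Watson-Crick base pairing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    """Swap each base for its pairing partner, keeping the order."""
    return seq.translate(COMPLEMENT)

def inverse(seq):
    """Reverse the order of the bases without complementing them."""
    return seq[::-1]

def inverse_complement(seq):
    """Complement the sequence and then reverse it."""
    return complement(seq)[::-1]

XBOX = "GTTGGCATGGCAAC"  # the RFX+88 site quoted from Zhang et al.

print("original          ", XBOX)
print("complement        ", complement(XBOX))
print("inverse complement", inverse_complement(XBOX))
print("inverse           ", inverse(XBOX))
```

The printed sequences are the ones searched for in items 1 to 4, 5 to 8, 9 to 12, and 13 to 16 respectively.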

X-box (Moreno) samplings

Copying the responsive element consensus sequence GTCCTCAT into a "⌘F" search finds none between ZNF497 and A1BG and none between ZSCAN22 and A1BG, as can also be found by the computer programs.

The Basic programs testing consensus sequence GT(A/C/T)N(C/T)(C/T)AT(A/G)(A/G)NAAC (starting with SuccessablesX-box.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The sequences they look for and the number of occurrences found are:

  1. negative strand, negative direction, looking for GT(A/C/T)N(C/T)(C/T)AT(A/G)(A/G)NAAC, 0.
  2. positive strand, negative direction, looking for GT(A/C/T)N(C/T)(C/T)AT(A/G)(A/G)NAAC, 0.
  3. positive strand, positive direction, looking for GT(A/C/T)N(C/T)(C/T)AT(A/G)(A/G)NAAC, 0.
  4. negative strand, positive direction, looking for GT(A/C/T)N(C/T)(C/T)AT(A/G)(A/G)NAAC, 0.
  5. complement, negative strand, negative direction, looking for CA(A/G/T)N(A/G)(A/G)TA(C/T)(C/T)NTTG, 0.
  6. complement, positive strand, negative direction, looking for CA(A/G/T)N(A/G)(A/G)TA(C/T)(C/T)NTTG, 0.
  7. complement, positive strand, positive direction, looking for CA(A/G/T)N(A/G)(A/G)TA(C/T)(C/T)NTTG, 0.
  8. complement, negative strand, positive direction, looking for CA(A/G/T)N(A/G)(A/G)TA(C/T)(C/T)NTTG, 0.
  9. inverse complement, negative strand, negative direction, looking for GTTN(C/T)(C/T)AT(A/G)(A/G)N(A/G/T)AC, 0.
  10. inverse complement, positive strand, negative direction, looking for GTTN(C/T)(C/T)AT(A/G)(A/G)N(A/G/T)AC, 0.
  11. inverse complement, positive strand, positive direction, looking for GTTN(C/T)(C/T)AT(A/G)(A/G)N(A/G/T)AC, 0.
  12. inverse complement, negative strand, positive direction, looking for GTTN(C/T)(C/T)AT(A/G)(A/G)N(A/G/T)AC, 0.
  13. inverse, negative strand, negative direction, looking for CAAN(A/G)(A/G)TA(C/T)(C/T)N(A/C/T)TG, 0.
  14. inverse, positive strand, negative direction, looking for CAAN(A/G)(A/G)TA(C/T)(C/T)N(A/C/T)TG, 0.
  15. inverse, positive strand, positive direction, looking for CAAN(A/G)(A/G)TA(C/T)(C/T)N(A/C/T)TG, 0.
  16. inverse, negative strand, positive direction, looking for CAAN(A/G)(A/G)TA(C/T)(C/T)N(A/C/T)TG, 0.
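A degenerate consensus written in the parenthesized notation used above can be handled as a list of allowed bases per position. The following Python sketch derives the complement, inverse complement, and inverse patterns and builds a regular expression for matching; the helper names are hypothetical and this is not the approach confirmed for the Basic programs.

```python
import re

COMP = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

def parse(pattern):
    """Split e.g. 'GT(A/C/T)N' into ['G', 'T', 'ACT', 'N']."""
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGTN])", pattern)
    return [alts.replace("/", "") if alts else single for alts, single in tokens]

def fmt(positions):
    """Write a position list back in the parenthesized notation."""
    return "".join(p if len(p) == 1 else "(" + "/".join(sorted(p)) + ")"
                   for p in positions)

def complement(positions):
    """Complement every allowed base at every position."""
    return ["".join(COMP[b] for b in p) for p in positions]

def to_regex(positions):
    """Compile a position list into a regular expression."""
    return re.compile("".join("[ACGT]" if p == "N" else "[" + p + "]"
                              for p in positions))

consensus = parse("GT(A/C/T)N(C/T)(C/T)AT(A/G)(A/G)NAAC")
print("complement        ", fmt(complement(consensus)))
print("inverse complement", fmt(complement(consensus)[::-1]))
print("inverse           ", fmt(consensus[::-1]))

# The consensus matches GTTGCCATGGCAAC but not the Zhang site
# GTTGGCATGGCAAC, whose fifth base (G) is outside (C/T).
print(bool(to_regex(consensus).fullmatch("GTTGCCATGGCAAC")))
print(bool(to_regex(consensus).fullmatch("GTTGGCATGGCAAC")))
```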

X2 box samplings

The Basic programs testing consensus sequence AGGTCCA (starting with SuccessablesX2box.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The number of occurrences found, with their positions, are:

  1. Negative strand, negative direction: 0.
  2. Positive strand, negative direction: 0.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 2, AGGTCCA at 4033, AGGTCCA at 3112.
  5. inverse complement, negative strand, negative direction: 0.
  6. inverse complement, positive strand, negative direction: 0.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 1, TGGACCT at 41.
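A search of this kind can be sketched in Python as follows. The test sequence `toy` is invented for illustration and is not taken from the A1BG region, and reporting the 1-based position of the first nucleotide of each hit is an assumption; the position convention of the Basic programs is not stated here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def inverse_complement(seq):
    """Complement the sequence and then reverse it."""
    return seq.translate(COMPLEMENT)[::-1]

def find_all(seq, motif):
    """Return the 1-based start position of every occurrence of motif."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start + 1)
        start = seq.find(motif, start + 1)  # step by one so overlaps are kept
    return hits

X2BOX = "AGGTCCA"
toy = "TTAGGTCCAGGCTGGACCTAAAGGTCCA"  # hypothetical sequence, not real data

for motif in (X2BOX, inverse_complement(X2BOX)):
    hits = find_all(toy, motif)
    print(len(hits), ", ".join(motif + " at " + str(pos) for pos in hits))
```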

X2box positive direction (4050-1) distal promoters

  1. Positive strand, positive direction: AGGTCCA at 4033, AGGTCCA at 3112.
  2. Positive strand, positive direction: TGGACCT at 41.

X2box random dataset samplings

  1. X2boxr0: 0.
  2. X2boxr1: 1, AGGTCCA at 2561.
  3. X2boxr2: 0.
  4. X2boxr3: 0.
  5. X2boxr4: 0.
  6. X2boxr5: 0.
  7. X2boxr6: 0.
  8. X2boxr7: 0.
  9. X2boxr8: 1, AGGTCCA at 1418.
  10. X2boxr9: 1, AGGTCCA at 576.
  11. X2boxr0ci: 0.
  12. X2boxr1ci: 0.
  13. X2boxr2ci: 0.
  14. X2boxr3ci: 1, TGGACCT at 76.
  15. X2boxr4ci: 0.
  16. X2boxr5ci: 1, TGGACCT at 4345.
  17. X2boxr6ci: 1, TGGACCT at 2016.
  18. X2boxr7ci: 0.
  19. X2boxr8ci: 0.
  20. X2boxr9ci: 0.

X2boxr alternate (odds) (4560-2846) UTRs

  1. X2boxr5ci: TGGACCT at 4345.

X2boxr arbitrary positive direction (odds) (4445-4265) core promoters

  1. X2boxr5ci: TGGACCT at 4345.

X2boxr arbitrary negative direction (evens) (2596-1) distal promoters

  1. X2boxr8: AGGTCCA at 1418.

X2boxr alternate negative direction (odds) (2596-1) distal promoters

  1. X2boxr1: AGGTCCA at 2561.
  2. X2boxr9: AGGTCCA at 576.
  3. X2boxr3ci: TGGACCT at 76.

X2boxr arbitrary positive direction (odds) (4050-1) distal promoters

  1. X2boxr1: AGGTCCA at 2561.
  2. X2boxr9: AGGTCCA at 576.
  3. X2boxr3ci: TGGACCT at 76.

X2boxr alternate positive direction (evens) (4050-1) distal promoters

  1. X2boxr8: AGGTCCA at 1418.
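The region headings in this article quote nucleotide ranges, so assigning a hit to a region amounts to an interval test. The Python sketch below uses only the ranges quoted in those headings; treating both endpoints as inclusive is an assumption, and the names `REGIONS` and `regions_for` are hypothetical.

```python
# (label, low, high) taken from the ranges quoted in the section headings.
REGIONS = [
    ("UTR", 2846, 4560),
    ("core promoter, positive direction", 4265, 4445),
    ("proximal promoter, negative direction", 2596, 2811),
    ("proximal promoter, positive direction", 4050, 4265),
    ("distal promoter, negative direction", 1, 2596),
    ("distal promoter, positive direction", 1, 4050),
]

def regions_for(position):
    """List every region whose range contains the position (inclusive bounds assumed)."""
    return [name for name, low, high in REGIONS if low <= position <= high]

for pos in (4345, 1418):
    print(pos, regions_for(pos))
```

For example, 4345 falls in both the UTR and the positive-direction core promoter, and 1418 falls in the distal promoter in either direction, which is how those hits are listed above.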

X2box analysis and results

X2 box is AGGTCCA.[2]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 0 | 2 | 0 | 0
Randoms | UTR | arbitrary negative | 0 | 10 | 0 | 0.05
Randoms | UTR | alternate negative | 1 | 10 | 0.1 | 0.05
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Core | alternate negative | 0 | 10 | 0 | 0
Reals | Core | positive | 0 | 2 | 0 | 0
Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.05
Randoms | Core | alternate positive | 0 | 10 | 0 | 0.05
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0
Reals | Proximal | positive | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0
Randoms | Proximal | alternate positive | 0 | 10 | 0 | 0
Reals | Distal | negative | 0 | 2 | 0 | 0
Randoms | Distal | arbitrary negative | 1 | 10 | 0.1 | 0.2
Randoms | Distal | alternate negative | 3 | 10 | 0.3 | 0.2
Reals | Distal | positive | 3 | 2 | 1.5 | 1.5 ± 1.5 (-+0,++3)
Randoms | Distal | arbitrary positive | 3 | 10 | 0.3 | 0.2 ± 0.1
Randoms | Distal | alternate positive | 1 | 10 | 0.1 | 0.2 ± 0.1

Comparison:

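Real X2 boxes occur only in the distal promoters in the positive direction, where their occurrences (1.5 ± 1.5) are greater than those of the randoms (0.2 ± 0.1). This suggests that the real X2 boxes are likely active or activable.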
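The Occurrences and Averages columns appear to follow simple arithmetic: occurrences are the number of hits divided by the number of strands or datasets, and the average with its ± value is the midpoint and half-range of two such values. This reading is inferred from the table values rather than stated by the source; the Python sketch below reproduces the distal positive rows under that assumption, with hypothetical function names.

```python
def occurrences(numbers, strands):
    """Hits per strand (reals) or per random dataset (randoms)."""
    return numbers / strands

def midpoint_and_half_range(a, b):
    """Average of two values and half the distance between them."""
    return round((a + b) / 2, 3), round(abs(a - b) / 2, 3)

# Reals, distal positive: 0 hits on the -+ strand and 3 on the ++ strand.
print(occurrences(3, 2), midpoint_and_half_range(0, 3))

# Randoms, distal positive: 3 arbitrary and 1 alternate hit over 10 datasets each.
arbitrary = occurrences(3, 10)
alternate = occurrences(1, 10)
print(arbitrary, alternate, midpoint_and_half_range(arbitrary, alternate))
```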

X1abox samplings

Copying the responsive element consensus sequence TCTGCC into a "⌘F" search finds one between ZNF497 and A1BG and none between ZSCAN22 and A1BG, as can also be found by the computer programs.

The Basic programs testing consensus sequence TCTGCC (starting with SuccessablesX1abox.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The number of occurrences found, with their positions, are:

  1. Negative strand, negative direction: 0.
  2. Positive strand, negative direction: 0.
  3. Negative strand, positive direction: 1, TCTGCC at 3359.
  4. Positive strand, positive direction: 1, TCTGCC at 399.
  5. inverse complement, negative strand, negative direction: 1, GGCAGA at 754.
  6. inverse complement, positive strand, negative direction: 4, GGCAGA at 3589, GGCAGA at 1967, GGCAGA at 1314, GGCAGA at 1023.
  7. inverse complement, negative strand, positive direction: 1, GGCAGA at 3473.
  8. inverse complement, positive strand, positive direction: 0.

X1abox (4560-2846) UTRs

  1. Positive strand, negative direction: GGCAGA at 3589.

X1abox negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: GGCAGA at 754.
  2. Positive strand, negative direction: GGCAGA at 1967, GGCAGA at 1314, GGCAGA at 1023.

X1abox positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: TCTGCC at 3359.
  2. Negative strand, positive direction: GGCAGA at 3473.
  3. Positive strand, positive direction: TCTGCC at 399.

X1abox random dataset samplings

  1. X1aboxr0: 1, TCTGCC at 4075.
  2. X1aboxr1: 0.
  3. X1aboxr2: 3, TCTGCC at 3346, TCTGCC at 3102, TCTGCC at 205.
  4. X1aboxr3: 1, TCTGCC at 29.
  5. X1aboxr4: 1, TCTGCC at 966.
  6. X1aboxr5: 0.
  7. X1aboxr6: 1, TCTGCC at 1883.
  8. X1aboxr7: 1, TCTGCC at 4038.
  9. X1aboxr8: 0.
  10. X1aboxr9: 2, TCTGCC at 2756, TCTGCC at 730.
  11. X1aboxr0ci: 0.
  12. X1aboxr1ci: 0.
  13. X1aboxr2ci: 3, GGCAGA at 4549, GGCAGA at 4443, GGCAGA at 334.
  14. X1aboxr3ci: 0.
  15. X1aboxr4ci: 0.
  16. X1aboxr5ci: 3, GGCAGA at 2593, GGCAGA at 1427, GGCAGA at 688.
  17. X1aboxr6ci: 1, GGCAGA at 2323.
  18. X1aboxr7ci: 1, GGCAGA at 1265.
  19. X1aboxr8ci: 1, GGCAGA at 3539.
  20. X1aboxr9ci: 0.

X1aboxr arbitrary (evens) (4560-2846) UTRs

  1. X1aboxr0: TCTGCC at 4075.
  2. X1aboxr2: TCTGCC at 3346, TCTGCC at 3102.
  3. X1aboxr2ci: GGCAGA at 4549, GGCAGA at 4443.

X1aboxr alternate (odds) (4560-2846) UTRs

  1. X1aboxr7: TCTGCC at 4038.

X1aboxr alternate positive direction (evens) (4445-4265) core promoters

  1. X1aboxr2ci: GGCAGA at 4443.

X1aboxr alternate negative direction (odds) (2811-2596) proximal promoters

  1. X1aboxr9: TCTGCC at 2756.

X1aboxr alternate positive direction (evens) (4265-4050) proximal promoters

  1. X1aboxr0: TCTGCC at 4075.

X1aboxr arbitrary negative direction (evens) (2596-1) distal promoters

  1. X1aboxr4: TCTGCC at 966.
  2. X1aboxr2ci: GGCAGA at 334.
  3. X1aboxr6ci: GGCAGA at 2323.

X1aboxr alternate negative direction (odds) (2596-1) distal promoters

  1. X1aboxr3: TCTGCC at 29.
  2. X1aboxr9: TCTGCC at 730.
  3. X1aboxr5ci: GGCAGA at 2593, GGCAGA at 1427, GGCAGA at 688.

X1aboxr arbitrary positive direction (odds) (4050-1) distal promoters

  1. X1aboxr3: TCTGCC at 29.
  2. X1aboxr7: TCTGCC at 4038.
  3. X1aboxr9: TCTGCC at 2756, TCTGCC at 730.
  4. X1aboxr5ci: GGCAGA at 2593, GGCAGA at 1427, GGCAGA at 688.

X1aboxr alternate positive direction (evens) (4050-1) distal promoters

  1. X1aboxr2: TCTGCC at 3346, TCTGCC at 3102, TCTGCC at 205.
  2. X1aboxr4: TCTGCC at 966.
  3. X1aboxr2ci: GGCAGA at 334.
  4. X1aboxr6ci: GGCAGA at 2323.

X1abox analysis and results

X-C box is AAAAAAAATGTCTGCCTAGAGACAGATGAGGTCCA.[2]

Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1)
Reals | UTR | negative | 1 | 2 | 0.5 | 0.5 ± 0.5 (+-1,--0)
Randoms | UTR | arbitrary negative | 5 | 10 | 0.5 | 0.3 ± 0.2
Randoms | UTR | alternate negative | 1 | 10 | 0.1 | 0.3 ± 0.2
Reals | Core | negative | 0 | 2 | 0 | 0
Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0
Randoms | Core | alternate negative | 0 | 10 | 0 | 0
Reals | Core | positive | 0 | 2 | 0 | 0
Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0.05
Randoms | Core | alternate positive | 1 | 10 | 0.1 | 0.05
Reals | Proximal | negative | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0.05
Randoms | Proximal | alternate negative | 1 | 10 | 0.1 | 0.05
Reals | Proximal | positive | 0 | 2 | 0 | 0
Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0.05
Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.05
Reals | Distal | negative | 4 | 2 | 2 | 2 ± 1 (--1,+-3)
Randoms | Distal | arbitrary negative | 3 | 10 | 0.3 | 0.4 ± 0.1
Randoms | Distal | alternate negative | 5 | 10 | 0.5 | 0.4 ± 0.1
Reals | Distal | positive | 3 | 2 | 1.5 | 1.5 ± 0.5 (-+2,++1)
Randoms | Distal | arbitrary positive | 7 | 10 | 0.7 | 0.65 ± 0.05
Randoms | Distal | alternate positive | 6 | 10 | 0.6 | 0.65 ± 0.05

Comparison:

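The occurrences of real X1aboxes in the UTRs (0.5 ± 0.5) and in the distal promoters (2 ± 1 in the negative direction, 1.5 ± 0.5 in the positive direction) are greater than those of the randoms (0.3 ± 0.2, 0.4 ± 0.1, and 0.65 ± 0.05, respectively). This suggests that the real X1aboxes are likely active or activable.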

X1bbox samplings

Copying the responsive element consensus sequence AGAGACAGAT into a "⌘F" search finds none between ZNF497 and A1BG and none between ZSCAN22 and A1BG, as can also be found by the computer programs.

The Basic programs testing consensus sequence AGAGACAGAT (starting with SuccessablesX1bbox.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The number of occurrences found are:

  1. Negative strand, negative direction: 0.
  2. Positive strand, negative direction: 0.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 0.
  6. inverse complement, positive strand, negative direction: 0.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

References

  1. Xian-Yang Zhang, Nabila Jabrane-Ferrat, Clement K. Asiedu, Sanja Samac, B. Matija Peterlin, and Melanie Ehrlich (November 1993). "The Major Histocompatibility Complex Class II Promoter-Binding Protein RFX (NF-X) Is a Methylated DNA-Binding Protein" (PDF). Molecular and Cellular Biology. 13 (11): 6810–8. Retrieved 5 April 2017.
  2. B. Ferstl, T. Zacher, B. Lauer, N. Blagitko-Dorfs, A. Carl and R. Wassmuth (2004). "Allele-specific quantification of HLA-DQB1 gene expression by real-time reverse transcriptase-polymerase chain reaction" (PDF). Genes and Immunity. 5: 405–416. doi:10.1038/sj.gene.6364108. Retrieved 23 November 2018.
  3. Eduardo Moreno, Maša Lenuzzi, Christian Rödelsperger, Neel Prabh, Hanh Witte, Waltraud Roeseler, Metta Riebesell, Ralf J. Sommer (November/December 2018). "DAF‐19/RFX controls ciliogenesis and influences oxygen‐induced social behaviors in Pristionchus pacificus". Evolution & Development. 20 (6): 233–243. doi:10.1111/ede.12271. Retrieved 9 March 2021.
