TATA box gene transcriptions

Editor-In-Chief: Henry A. Hoff

This image is a drawing of Haloquadratum walsbyi. Credit: Rotational.

The TATA box (also called Goldberg-Hogness box)[1] is a DNA sequence (cis-regulatory element) found in the promoter region of genes in archaea and eukaryotes;[2] approximately 24% of human genes contain a TATA box within the core promoter.[3]

The TATA box is a binding site of either general transcription factors or histones.

Consensus sequences

In the direction of transcription along the DNA strand, the TATA box has the core DNA sequence 3'-TATAAA-5' or a variant, which is usually followed by three or more adenine (A) bases, specifically [3'-TATAAA(A)AAA-5' on the template strand].

"[M]ost of the diversity within metazoan core promoters appears to involve the variable occurrence of consensus or near-consensus TATA, Inr, and DPE elements."[4]

The TATA box can be an AT-rich sequence "located at a fixed distance upstream of the transcription start site"[2].

Histones

The binding of a transcription factor blocks the binding of a histone and vice versa.

Gene expressions

Although it is harder to regulate the transcription of genes with multiple transcription start sites, "variations in the expression of a constitutive gene would be minimized by the use of multiple start sites."[5]

Earlier "studies led to the design of a super core promoter (SCP) that contains a TATA, Inr, MTE, and DPE in a single promoter (Juven-Gershon et al., 2006b). The SCP is the strongest core promoter observed in vitro and in cultured cells and yields high levels of transcription in conjunction with transcriptional enhancers. These findings indicate that gene expression levels can be modulated via the core promoter."[5]

Human genes

"Nine elements were tested, representing a sampling of elements present in the two gene deserts and DACH introns, spread over a 1530-kb region surrounding the human DACH's TATA box."[6]

Gene ID: 1602 is the human gene DACH1 dachshund homolog 1 also known as DACH.[7] DACH1 has three isoforms: a, b, and c.

"[T]he human ... prostaglandin-endoperoxide-synthase-2 [gene contains] a canonical TATA box (nucleotide residues at positions -31 to -25 for the human gene)."[8] This is Gene ID: 5743.

The Drosophila hsp70 gene has a TATA box-containing promoter.[9] This suggests that Gene ID: 3308, HSPA4 heat shock 70kDa protein 4 [Homo sapiens], also known as hsp70,[10] has a TATA box in its core promoter.

Gene transcriptions

"From a teleological standpoint, this arrangement [of focused promoters] is consistent with the notion that it would be easier to regulate the transcription of a gene with a single transcription start site than one with multiple start sites."[5]

The TATA box is involved in the process of transcription by RNA polymerase.

Approximately “76% of human core promoters lack TATA-like elements, have a high GC content, and are enriched in Sp1 binding sites.”[3]

"[T]wo motifs - M3 (SCGGAAGY) and M22 (TGCGCANK) - ... occur preferentially in human TATA-less core promoters."[3]

"About 24% of human genes have a TATA-like element and their promoters are generally AT-rich; however, only ~10% of these TATA-containing promoters have the canonical TATA box (TATAWAWR). In contrast, ~46% of human core promoters contain the consensus INR (YYANWYY) and ~30% are INR-containing TATA-less genes."[3] W = A or T, Y = C or T, N = G, A, T, or C, and R = A or G.

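By subtraction, the remaining ~46% of human core promoters (100% minus the ~24% with a TATA-like element and the ~30% that are INR-containing but TATA-less) lack both TATA-like and consensus INR elements.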
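The degenerate letters in the consensus sequences quoted above can be expanded mechanically. The following is a minimal sketch, not the method used by the cited studies: the function name consensus_to_regex is hypothetical, and the S (C or G) and K (G or T) entries are the standard IUPAC meanings, added here because the M3 and M22 motifs use them.

```python
import re

# IUPAC nucleotide codes used in the consensus sequences quoted above.
# W, Y, N and R are defined in the text; S and K are standard IUPAC codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "Y": "[CT]", "R": "[AG]",
    "K": "[GT]", "N": "[ACGT]",
}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus such as TATAWAWR into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))

tata = consensus_to_regex("TATAWAWR")
inr = consensus_to_regex("YYANWYY")

print(bool(tata.fullmatch("TATAAAAG")))  # True
print(bool(tata.fullmatch("TATACAAG")))  # False
print(bool(inr.fullmatch("TCAGTCT")))    # True
```

The same helper accepts SCGGAAGY (M3) and TGCGCANK (M22) without modification.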

Transcription start sites

The consensus sequence is usually located about 25 base pairs (bp), or nucleotides (nt), upstream (-) of the transcription start site (TSS).
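Upstream positions are conventionally written as negative numbers relative to the +1 nucleotide. The sketch below shows one way to convert string indices into such coordinates, assuming the common convention that there is no position 0. The toy sequence, the helper names and the choice of the core sequence TATAAA are illustrative assumptions, not data from this article.

```python
CORE_TATA = "TATAAA"  # core sequence given in the text

def relative_position(index, plus_one_index):
    """Convert a 0-based index to a TSS-relative coordinate with no position 0."""
    offset = index - plus_one_index
    return offset + 1 if offset >= 0 else offset

def tata_spans(sequence, plus_one_index, motif=CORE_TATA):
    """Return (first, last) TSS-relative coordinates of every motif match."""
    spans = []
    start = sequence.find(motif)
    while start != -1:
        end = start + len(motif) - 1
        spans.append((relative_position(start, plus_one_index),
                      relative_position(end, plus_one_index)))
        start = sequence.find(motif, start + 1)
    return spans

# Hypothetical toy promoter: 10 filler bases, TATAAA, 25 filler bases, then A at +1.
toy = "G" * 10 + "TATAAA" + "C" * 25 + "A" + "G" * 5
plus_one = 10 + 6 + 25  # 0-based index of the +1 nucleotide

print(tata_spans(toy, plus_one))  # [(-31, -26)]
```

In this toy case the box spans -31 to -26, which places it roughly 25 to 30 nt upstream of the TSS.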

Focused promoters

"In focused transcription, there is either a single major transcription start site or several start sites within a narrow region of several nucleotides. Focused transcription is the predominant mode of transcription in simpler organisms."[5]

"Focused transcription initiation occurs in all organisms, and appears to be the predominant or exclusive mode of transcription in simpler organisms."[5]

"In vertebrates, focused transcription tends to be associated with regulated promoters".[5]

"The analysis of focused core promoters has led to the discovery of sequence motifs such as the TATA box, BREu (upstream TFIIBrecognition element), Inr (initiator), MTE (motif ten element), DPE (downstream promoter element), DCE (downstream core element), and XCPE1 (Xcore promoter element 1) [...]."[5]

Dispersed promoters

"In dispersed transcription, there are several weak transcription start sites over a broad region of about 50 to 100 nucleotides. Dispersed transcription is the most common mode of transcription in vertebrates. For instance, dispersed transcription is observed in about two-thirds of human genes."[5]

In vertebrates, "dispersed transcription is typically observed in constitutive promoters in CpG islands."[5]

Core promoters

"Focused transcription typically initiates within the Inr, and the A nucleotide in the Inr consensus is usually designed as the “+ 1” position, whether or not transcription actually initiates at that particular nucleotide. This convention is useful because other core promoter motifs, such as the MTE and DPE, function with the Inr in a manner that exhibits a strict spacing dependence with the Inr consensus sequence (and hence, the A + 1 nucleotide) rather than the actual transcription start site (Burke and Kadonaga, 1997, Kutach and Kadonaga, 2000 and Lim et al., 2004)."[5]

"With TATA-driven core promoters, transcription can be achieved in vitro with purified RNA polymerase II, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH."[5]

"NC2 (negative cofactor 2; also known as Dr1-Drap1) [...] was identified as repressor of TATA-dependent transcription [...]."[5]

"TBP (TATA box-binding protein) activates TATA transcription [...] The TBP subunit binds to the TATA box [...] TFIIA appears to promote the binding of TBP to the TATA box."[5]

TATA boxes

"The TATA box is the first core promoter motif that was discovered (Goldberg, 1979) as well as the best known core promoter element. The metazoan TATA box consensus is TATAWAAR, where the upstream T is usually located at − 31 or − 30 relative to the A + 1 (or G + 1) position in the Inr (Carninci et al., 2006 and Ponjavic et al., 2006). [The] TATA box is recognized and bound by the TBP subunit of the TFIID complex. Both the TATA box and TBP are conserved from archaebacteria to humans (Reeve, 2003). The TATA box is also present in plants (Molina and Grotewold, 2005, Yamamoto et al., 2007a and Yamamoto et al., 2007b). Although the TATA box is a well known core promoter motif, it is present in only about 10%–15% of mammalian core promoters (Carninci et al., 2006, Kim et al., 2005 and Cooper et al., 2006)."[5]

"The BRE (TFIIBrecognition element) was initially identified as a TFIIB binding sequence that is immediately upstream of a subset (∼ 10%–30%) of TATA box elements (Lagrange et al., 1998). In addition, a second TFIIB recognition site, the BREd (downstream TFIIB recognition element), was found immediately downstream of the TATA box (Deng and Roberts, 2005). The discovery of the BREd led to the renaming of the original BRE as BREu for upstream BRE (reviewed in Deng and Roberts, 2007). Both the BREu and BREd function in conjunction with a TATA box and have been found to increase as well as to decrease the levels of basal transcription ( Lagrange et al., 1998, Evans et al., 2001 and Deng and Roberts, 2005). More recent studies suggest a distinct role for the BREu in transcriptional regulation (Juven-Gershon et al., 2008a; [...])."[5]

"TRF3 (also known as TBP2 and TBPL2) appears to be present only in vertebrates and is the TRF that is most closely related to TBP. TRF3 can bind to TATA boxes and support TATA-dependent transcription (Bártfai et al., 2004 and Jallow et al., 2004). TRF3 was found to be important for embryonic development (Bártfai et al., 2004 and Jallow et al., 2004). In addition, zebrafish embryos that are depleted of TRF3 exhibit multiple developmental defects and fail to undergo hematopoiesis (Hart et al., 2007)."[5]

"The top six sequences form a cohort of the eight-member consensus TATA(A/T)A(A/T)(A/G). Two missing members are TATA(A/T)ATG. This sequence ends in an ATG translational start codon, and thus is expected to be underrepresented in promoters. Since it is nevertheless part of the larger consensus that avidly binds TBP, this sequence was included in the TATA consensus, although it is rarely used."[11]

Hypotheses

  1. A1BG is not transcribed using a TATA box.

TATA box (Butler 2002) samplings

The diagram shows an overview of the four core promoter elements B recognition element (BRE), TATA box, initiator element (Inr), and downstream promoter element (DPE), with their respective consensus sequences and their distance from the transcription start site. Credit: Jennifer E.F. Butler & James T. Kadonaga.

The Basic programs testing the consensus sequence TATAAA (starting with SuccessablesTATAB.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The searches and their results are:

  1. Negative strand, negative direction: 2, TATAAA at 2852, TATAAA at 1602.
  2. Positive strand, negative direction: 3, TATAAA at 2874, TATAAA at 221, TATAAA at 182.
  3. negative strand, positive direction: 0.
  4. positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 3, TTTATA at 2869, TTTATA at 2638, TTTATA at 1740.
  6. inverse complement, positive strand, negative direction: 1, TTTATA at 219.
  7. inverse complement, negative strand, positive direction: 1, TTTATA at 2588.
  8. inverse complement, positive strand, positive direction: 0.
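The eight searches listed above can be approximated in a few lines. This is a hedged sketch rather than a port of the Basic programs, whose source is not shown here. It assumes that the template strand is the base-by-base complement of the coding strand, and that each reported position is the 1-based index of the last nucleotide of a match (an inference from hits such as TATAAA at 2852 and TATAAAA at 2853). The toy sequence and all function names are hypothetical.

```python
def complement(seq):
    """Return the base-by-base complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))

def inverse_complement(seq):
    """Return the complement read in the opposite direction."""
    return complement(seq)[::-1]

def find_ends(sequence, motif):
    """Return 1-based end positions of every (possibly overlapping) match."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start + len(motif))
        start = sequence.find(motif, start + 1)
    return hits

def sample(coding_strand, motif):
    """Search both strands for the motif and for its inverse complement."""
    template_strand = complement(coding_strand)
    targets = {"direct": motif, "inverse complement": inverse_complement(motif)}
    return {
        (label, strand_name): find_ends(strand, target)
        for label, target in targets.items()
        for strand_name, strand in (("negative", template_strand),
                                    ("positive", coding_strand))
    }

# Hypothetical toy sequence, not the data sampled above.
toy = "GGTATAAACCTTTATAGG"
for key, ends in sample(toy, "TATAAA").items():
    print(key, ends)
```

Running the same function once on the sequence in the negative direction and once on the sequence in the positive direction would give the eight combinations in the list.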

TATAB (4560-2846) UTRs

  1. Negative strand, negative direction: TATAAA at 2852.
  2. Negative strand, negative direction: TTTATA at 2869.
  3. Positive strand, negative direction: TATAAA at 2874.

TATAB negative direction (2811-2596) proximal promoters

  1. Negative strand, negative direction: TTTATA at 2638.

TATAB negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: TATAAA at 1602.
  2. Negative strand, negative direction: TTTATA at 1740.
  3. Positive strand, negative direction: TATAAA at 221, TATAAA at 182.
  4. Positive strand, negative direction: TTTATA at 219.

TATAB positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: TTTATA at 2588.
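Assigning each hit to a UTR, core, proximal or distal region is a range lookup. The sketch below copies the coordinate ranges from the section headings of these samplings and treats them as inclusive, which is an assumption. The dictionary and function names are hypothetical, and no negative-direction core range is included because none is stated in the headings.

```python
# Inclusive 1-based coordinate ranges copied from the sampling headings.
NEGATIVE_DIRECTION = {
    "UTR": (2846, 4560),
    "proximal promoter": (2596, 2811),
    "distal promoter": (1, 2596),
}
POSITIVE_DIRECTION = {
    "core promoter": (4265, 4445),
    "proximal promoter": (4050, 4265),
    "distal promoter": (1, 4050),
}

def regions_for(position, ranges):
    """Return every named region whose inclusive range contains position."""
    return [name for name, (low, high) in ranges.items() if low <= position <= high]

print(regions_for(2852, NEGATIVE_DIRECTION))  # ['UTR']
print(regions_for(2638, NEGATIVE_DIRECTION))  # ['proximal promoter']
print(regions_for(1602, NEGATIVE_DIRECTION))  # ['distal promoter']
print(regions_for(2588, POSITIVE_DIRECTION))  # ['distal promoter']
```

Because adjacent ranges share a boundary value (for example 2596), a hit exactly on a boundary is returned for both neighbouring regions under this inclusive reading.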

TATA box (Butler 2002) random dataset samplings

  1. TATABr0: 2, TATAAA at 3565, TATAAA at 499.
  2. TATABr1: 0.
  3. TATABr2: 1, TATAAA at 3856.
  4. TATABr3: 1, TATAAA at 4444.
  5. TATABr4: 2, TATAAA at 3685, TATAAA at 733.
  6. TATABr5: 1, TATAAA at 1563.
  7. TATABr6: 0.
  8. TATABr7: 1, TATAAA at 3629.
  9. TATABr8: 2, TATAAA at 706, TATAAA at 555.
  10. TATABr9: 3, TATAAA at 4219, TATAAA at 3150, TATAAA at 2621.
  11. TATABr0ci: 2, TTTATA at 1139, TTTATA at 497.
  12. TATABr1ci: 0.
  13. TATABr2ci: 0.
  14. TATABr3ci: 0.
  15. TATABr4ci: 1, TTTATA at 4527.
  16. TATABr5ci: 1, TTTATA at 2178.
  17. TATABr6ci: 0.
  18. TATABr7ci: 2, TTTATA at 4252, TTTATA at 2452.
  19. TATABr8ci: 1, TTTATA at 4222.
  20. TATABr9ci: 2, TTTATA at 3988, TTTATA at 162.
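How the random datasets above were generated is not stated, so the following is only a sketch under explicit assumptions: each base is drawn independently with equal probability, each dataset is 4560 nucleotides long (the largest coordinate in the headings), and ten datasets are drawn. The function names and the seed are hypothetical.

```python
import random

def random_sequence(length, rng):
    """Draw a sequence with each base equally likely (an assumption)."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def count_hits(sequence, motif):
    """Count possibly overlapping occurrences of motif."""
    count, start = 0, sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count

def expected_hits(length, motif_length):
    """Expected number of exact matches in a uniform random sequence."""
    return (length - motif_length + 1) / 4 ** motif_length

rng = random.Random(0)  # fixed seed so the sketch is repeatable
datasets = [random_sequence(4560, rng) for _ in range(10)]
counts = [count_hits(seq, "TATAAA") for seq in datasets]

print(counts)
print(round(expected_hits(4560, 6), 2))  # 1.11
```

For comparison, the ten direct random datasets listed above (TATABr0 to TATABr9) contain 13 TATAAA hits in total, or 1.3 per dataset, which is close to the ~1.11 expected under these assumptions.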

TATABr arbitrary (evens) (4560-2846) UTRs

  1. TATABr0: TATAAA at 3565.
  2. TATABr2: TATAAA at 3856.
  3. TATABr4: TATAAA at 3685.
  4. TATABr4ci: TTTATA at 4527.
  5. TATABr8ci: TTTATA at 4222.

TATABr alternate (odds) (4560-2846) UTRs

  1. TATABr3: TATAAA at 4444.
  2. TATABr9: TATAAA at 4219, TATAAA at 3150.
  3. TATABr7ci: TTTATA at 4252, TTTATA at 2452.
  4. TATABr9ci: TTTATA at 3988.

TATABr arbitrary positive direction (odds) (4445-4265) core promoters

  1. TATABr3: TATAAA at 4444.

TATABr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. TATABr9: TATAAA at 4219.
  2. TATABr7ci: TTTATA at 4252.

TATABr alternate positive direction (evens) (4265-4050) proximal promoters

  1. TATABr8ci: TTTATA at 4222.

TATABr arbitrary negative direction (evens) (2596-1) distal promoters

  1. TATABr0: TATAAA at 499.
  2. TATABr4: TATAAA at 733.
  3. TATABr8: TATAAA at 706, TATAAA at 555.
  4. TATABr0ci: TTTATA at 1139, TTTATA at 497.

TATABr alternate negative direction (odds) (2596-1) distal promoters

  1. TATABr5: TATAAA at 1563.
  2. TATABr5ci: TTTATA at 2178.
  3. TATABr7ci: TTTATA at 2452.
  4. TATABr9ci: TTTATA at 162.

TATABr arbitrary positive direction (odds) (4050-1) distal promoters

  1. TATABr5: TATAAA at 1563.
  2. TATABr9: TATAAA at 3150, TATAAA at 2621.
  3. TATABr5ci: TTTATA at 2178.
  4. TATABr7ci: TTTATA at 2452.
  5. TATABr9ci: TTTATA at 3988, TTTATA at 162.

TATABr alternate positive direction (evens) (4050-1) distal promoters

  1. TATABr0: TATAAA at 3565, TATAAA at 499.
  2. TATABr2: TATAAA at 3856.
  3. TATABr4: TATAAA at 3685, TATAAA at 733.
  4. TATABr8: TATAAA at 706, TATAAA at 555.
  5. TATABr0ci: TTTATA at 1139, TTTATA at 497.

TATA box (Butler 2002) analysis and results

TATA box consensus sequence TATAAA.[12]

| Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
|---|---|---|---|---|---|---|
| Reals | UTR | negative | 3 | 2 | 1.5 | 1.5 ± 0.5 (--2,+-1) |
| Randoms | UTR | arbitrary negative | 5 | 10 | 0.5 | 0.55 |
| Randoms | UTR | alternate negative | 6 | 10 | 0.6 | 0.55 |
| Reals | Core | negative | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | positive | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.05 |
| Randoms | Core | alternate positive | 0 | 10 | 0 | 0.05 |
| Reals | Proximal | negative | 1 | 2 | 0.5 | 0.5 ± 0.5 (--1,+-0) |
| Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Proximal | positive | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary positive | 2 | 10 | 0.2 | 0.15 |
| Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.15 |
| Reals | Distal | negative | 5 | 2 | 2.5 | 2.5 ± 0.5 (--2,+-3) |
| Randoms | Distal | arbitrary negative | 6 | 10 | 0.6 | 0.5 |
| Randoms | Distal | alternate negative | 4 | 10 | 0.4 | 0.5 |
| Reals | Distal | positive | 1 | 2 | 0.5 | 0.5 ± 0.5 (-+1,++0) |
| Randoms | Distal | arbitrary positive | 7 | 10 | 0.7 | 0.8 |
| Randoms | Distal | alternate positive | 9 | 10 | 0.9 | 0.8 |
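The derived columns in the table above appear to follow simple arithmetic: Occurrences is Numbers divided by Strands, the random Averages value is the mean of the arbitrary and alternate occurrences, and the ± term for reals matches half the difference between the two per-strand counts. That reading is an inference from the tabulated values, and the function names below are hypothetical.

```python
def occurrences(numbers, strands):
    """Hits per strand (reals) or per random dataset (randoms)."""
    return numbers / strands

def mean_and_half_range(per_strand_counts):
    """Mean of the per-strand counts and half the spread between them."""
    mean = sum(per_strand_counts) / len(per_strand_counts)
    half_range = (max(per_strand_counts) - min(per_strand_counts)) / 2
    return mean, half_range

# Real TATAB UTR hits: 2 on the negative strand, 1 on the positive strand.
print(mean_and_half_range([2, 1]))  # (1.5, 0.5)

# Random TATAB UTR hits: 5 (arbitrary) and 6 (alternate) over 10 datasets each.
arbitrary = occurrences(5, 10)
alternate = occurrences(6, 10)
print(round((arbitrary + alternate) / 2, 2))  # 0.55
```

The same two helpers reproduce the other rows, for example 2.5 ± 0.5 for the real distal negative-direction hits (2 and 3).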

Comparison:

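The occurrences of real TATAB UTRs and of negative-direction proximals and distals are greater than the randoms, whereas the real positive-direction proximals and distals are not. This suggests that the real TATABs in the UTR and in the negative direction are likely active or activable.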

TATA boxes (Carninci 2006) samplings

The Basic programs testing the consensus sequence TATAAAA (starting with SuccessablesTATACA--.bas.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The searches and their results are:

  1. negative strand, negative direction: 1, TATAAAA at 2853.
  2. positive strand, negative direction: 2, TATAAAA at 222, TATAAAA at 183.
  3. negative strand, positive direction: 0.
  4. positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 2, TTTTATA at 2869, TTTTATA at 1740.
  6. inverse complement, positive strand, negative direction: 1, TTTTATA at 219.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.

TATAAAA (4560-2846) UTRs

  1. Negative strand, negative direction: TATAAAA at 2853.
  2. Negative strand, negative direction: TTTTATA at 2869.

TATAAAA negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: TTTTATA at 1740.
  2. Positive strand, negative direction: TATAAAA at 222, TATAAAA at 183.
  3. Positive strand, negative direction: TTTTATA at 219.

TATA boxes (Carninci 2006) random dataset samplings

  1. TATACr0: 0.
  2. TATACr1: 0.
  3. TATACr2: 0.
  4. TATACr3: 1, TATAAAA at 4445.
  5. TATACr4: 0.
  6. TATACr5: 0.
  7. TATACr6: 0.
  8. TATACr7: 0.
  9. TATACr8: 1, TATAAAA at 707.
  10. TATACr9: 1, TATAAAA at 2622.
  11. TATACr0ci: 1, TTTTATA at 497.
  12. TATACr1ci: 0.
  13. TATACr2ci: 0.
  14. TATACr3ci: 0.
  15. TATACr4ci: 1, TTTTATA at 4527.
  16. TATACr5ci: 0.
  17. TATACr6ci: 0.
  18. TATACr7ci: 0.
  19. TATACr8ci: 0.
  20. TATACr9ci: 1, TTTTATA at 162.

TATACr arbitrary (evens) (4560-2846) UTRs

  1. TATACr4ci: TTTTATA at 4527.

TATACr alternate (odds) (4560-2846) UTRs

  1. TATACr3: TATAAAA at 4445.

TATACr arbitrary positive direction (odds) (4445-4265) core promoters

  1. TATACr3: TATAAAA at 4445.

TATACr alternate negative direction (odds) (2811-2596) proximal promoters

  1. TATACr9: TATAAAA at 2622.

TATACr arbitrary negative direction (evens) (2596-1) distal promoters

  1. TATACr8: TATAAAA at 707.
  2. TATACr0ci: TTTTATA at 497.

TATACr alternate negative direction (odds) (2596-1) distal promoters

  1. TATACr9ci: TTTTATA at 162.

TATACr arbitrary positive direction (odds) (4050-1) distal promoters

  1. TATACr9: TATAAAA at 2622.
  2. TATACr9ci: TTTTATA at 162.

TATACr alternate positive direction (evens) (4050-1) distal promoters

  1. TATACr8: TATAAAA at 707.
  2. TATACr0ci: TTTTATA at 497.

TATAAAA (Carninci 2006) analysis and results

A genome-wide study put the fraction of TATAAAA-dependent human promoters at ~10%.[13]

| Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
|---|---|---|---|---|---|---|
| Reals | UTR | negative | 2 | 2 | 1 | 1 ± 1 (--2,+-0) |
| Randoms | UTR | arbitrary negative | 1 | 10 | 0.1 | 0.1 |
| Randoms | UTR | alternate negative | 1 | 10 | 0.1 | 0.1 |
| Reals | Core | negative | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | positive | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.05 |
| Randoms | Core | alternate positive | 0 | 10 | 0 | 0.05 |
| Reals | Proximal | negative | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0.05 |
| Randoms | Proximal | alternate negative | 1 | 10 | 0.1 | 0.05 |
| Reals | Proximal | positive | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary positive | 0 | 10 | 0 | 0 |
| Randoms | Proximal | alternate positive | 0 | 10 | 0 | 0 |
| Reals | Distal | negative | 4 | 2 | 2 | 2 ± 1 (--1,+-3) |
| Randoms | Distal | arbitrary negative | 2 | 10 | 0.2 | 0.15 |
| Randoms | Distal | alternate negative | 1 | 10 | 0.1 | 0.15 |
| Reals | Distal | positive | 0 | 2 | 0 | 0 |
| Randoms | Distal | arbitrary positive | 2 | 10 | 0.2 | 0.2 |
| Randoms | Distal | alternate positive | 2 | 10 | 0.2 | 0.2 |

Comparison:

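The occurrences of real TATAC UTRs and negative-direction distals are greater than the randoms; no real TATACs occur in the other regions. This suggests that the real TATACs in the UTR and negative-direction distal promoter are likely active or activable.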

TATA box (Watson 2014) samplings

The Basic programs testing the consensus sequence TATA(A/T)A(A/T) (starting with SuccessablesTATAW.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The searches and their results are:

  1. Negative strand, negative direction: 4, TATATAT at 2872, TATAAAA at 2853, TATATAA at 1601, TATATAT at 1599.
  2. Positive strand, negative direction: 4, TATATAA at 2873, TATATAT at 1600, TATAAAA at 222, TATAAAA at 183.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 4, TTATATA at 2871, TTTTATA at 2869, ATTTATA at 2638, TTTTATA at 1740.
  6. inverse complement, positive strand, negative direction: 2, ATATATA at 1599, TTTTATA at 219.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.

TATAW (4560-2846) UTRs

  1. Negative strand, negative direction: TATATAT at 2872, TATAAAA at 2853.
  2. Negative strand, negative direction: TTATATA at 2871, TTTTATA at 2869.
  3. Positive strand, negative direction: TATATAA at 2873.

TATAW negative direction (2811-2596) proximal promoters

  1. Negative strand, negative direction: ATTTATA at 2638.

TATAW negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: TATATAA at 1601, TATATAT at 1599.
  2. Negative strand, negative direction: TTTTATA at 1740.
  3. Positive strand, negative direction: TATATAT at 1600, TATAAAA at 222, TATAAAA at 183.
  4. Positive strand, negative direction: TTTTATA at 219.

TATA box (Watson 2014) random dataset samplings

  1. TATAWr0: 1, TATAAAT at 3566.
  2. TATAWr1: 1, TATATAT at 139.
  3. TATAWr2: 0.
  4. TATAWr3: 1, TATAAAA at 4445.
  5. TATAWr4: 0.
  6. TATAWr5: 1, TATAAAT at 1564.
  7. TATAWr6: 0.
  8. TATAWr7: 1, TATAAAT at 3630.
  9. TATAWr8: 1, TATAAAA at 707.
  10. TATAWr9: 2, TATATAA at 3149, TATAAAA at 2622.
  11. TATAWr0ci: 2, ATATATA at 2701, TTTTATA at 497.
  12. TATAWr1ci: 1, TTATATA at 1267.
  13. TATAWr2ci: 0.
  14. TATAWr3ci: 1, ATATATA at 3803.
  15. TATAWr4ci: 2, TTTTATA at 4527, TTATATA at 4254.
  16. TATAWr5ci: 0.
  17. TATAWr6ci: 1, TTATATA at 833.
  18. TATAWr7ci: 1, TTATATA at 4254.
  19. TATAWr8ci: 0.
  20. TATAWr9ci: 1, TTTTATA at 162.

TATAWr arbitrary (evens) (4560-2846) UTRs

  1. TATAWr0: TATAAAT at 3566.
  2. TATAWr4ci: TTTTATA at 4527, TTATATA at 4254.

TATAWr alternate (odds) (4560-2846) UTRs

  1. TATAWr3: TATAAAA at 4445.
  2. TATAWr7: TATAAAT at 3630.
  3. TATAWr9: TATATAA at 3149.
  4. TATAWr3ci: ATATATA at 3803.
  5. TATAWr7ci: TTATATA at 4254.

TATAWr arbitrary positive direction (odds) (4445-4265) core promoters

  1. TATAWr3: TATAAAA at 4445.

TATAWr arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. TATAWr0ci: ATATATA at 2701.

TATAWr alternate negative direction (odds) (2811-2596) proximal promoters

  1. TATAWr9: TATAAAA at 2622.

TATAWr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. TATAWr7ci: TTATATA at 4254.

TATAWr alternate positive direction (evens) (4265-4050) proximal promoters

  1. TATAWr4ci: TTATATA at 4254.

TATAWr arbitrary negative direction (evens) (2596-1) distal promoters

  1. TATAWr8: TATAAAA at 707.
  2. TATAWr0ci: TTTTATA at 497.
  3. TATAWr6ci: TTATATA at 833.

TATAWr alternate negative direction (odds) (2596-1) distal promoters

  1. TATAWr1: TATATAT at 139.
  2. TATAWr5: TATAAAT at 1564.
  3. TATAWr1ci: TTATATA at 1267.
  4. TATAWr9ci: TTTTATA at 162.

TATAWr arbitrary positive direction (odds) (4050-1) distal promoters

  1. TATAWr1: TATATAT at 139.
  2. TATAWr5: TATAAAT at 1564.
  3. TATAWr7: TATAAAT at 3630.
  4. TATAWr9: TATATAA at 3149, TATAAAA at 2622.
  5. TATAWr1ci: TTATATA at 1267.
  6. TATAWr3ci: ATATATA at 3803.
  7. TATAWr9ci: TTTTATA at 162.

TATAWr alternate positive direction (evens) (4050-1) distal promoters

  1. TATAWr0: TATAAAT at 3566.
  2. TATAWr8: TATAAAA at 707.
  3. TATAWr0ci: ATATATA at 2701, TTTTATA at 497.
  4. TATAWr6ci: TTATATA at 833.

TATA box (Watson 2014) analysis and results

The TATA box is a component of the eukaryotic core promoter and generally contains the consensus sequence TATA(A/T)A(A/T).[14]

| Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
|---|---|---|---|---|---|---|
| Reals | UTR | negative | 5 | 2 | 2.5 | 2.5 ± 1.5 (--4,+-1) |
| Randoms | UTR | arbitrary negative | 3 | 10 | 0.3 | 0.4 |
| Randoms | UTR | alternate negative | 5 | 10 | 0.5 | 0.4 |
| Reals | Core | negative | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | positive | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary positive | 1 | 10 | 0.1 | 0.05 |
| Randoms | Core | alternate positive | 0 | 10 | 0 | 0.05 |
| Reals | Proximal | negative | 1 | 2 | 0.5 | 0.5 ± 0.5 (--1,+-0) |
| Randoms | Proximal | arbitrary negative | 1 | 10 | 0.1 | 0.1 |
| Randoms | Proximal | alternate negative | 1 | 10 | 0.1 | 0.1 |
| Reals | Proximal | positive | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary positive | 1 | 10 | 0.1 | 0.1 |
| Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.1 |
| Reals | Distal | negative | 7 | 2 | 3.5 | 3.5 ± 0.5 (--3,+-4) |
| Randoms | Distal | arbitrary negative | 3 | 10 | 0.3 | 0.35 |
| Randoms | Distal | alternate negative | 4 | 10 | 0.4 | 0.35 |
| Reals | Distal | positive | 0 | 2 | 0 | 0 |
| Randoms | Distal | arbitrary positive | 8 | 10 | 0.8 | 0.65 |
| Randoms | Distal | alternate positive | 5 | 10 | 0.5 | 0.65 |

Comparison:

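The occurrences of real TATAW UTRs and of negative-direction proximals and distals are greater than the randoms, whereas the real positive-direction distals are not. This suggests that the real TATAWs in the UTR and in the negative direction are likely active or activable.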

TATA box (Juven-Gershon 2010) samplings

The Basic programs (starting with SuccessablesTATAJ.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The searches and their results are:

  1. Negative strand, negative direction, looking for TATA(A/T)AA(A/G): 1, TATATAAA at 1602.
  2. Positive strand, negative direction: 3, TATATAAA at 2874, TATAAAAG at 223, TATAAAAG at 184.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction, looking for (C/T)TT(A/T)TATA: 1, TTTATATA at 2871.
  6. inverse complement, positive strand, negative direction: 1, TTTTTATA at 219.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.
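The inverse-complement search patterns used above, such as (C/T)TT(A/T)TATA for TATA(A/T)AA(A/G), can be derived from the direct consensus by complementing each IUPAC code and reversing the result. A minimal sketch follows, with a hypothetical function name and the standard IUPAC complement pairs:

```python
# Complement of each IUPAC code: W and S pair with themselves, R with Y, K with M.
IUPAC_COMPLEMENT = str.maketrans("ACGTWSRYKMN", "TGCAWSYRMKN")

def inverse_complement(consensus):
    """Complement every IUPAC code, then reverse the sequence."""
    return consensus.upper().translate(IUPAC_COMPLEMENT)[::-1]

print(inverse_complement("TATAWAAR"))  # YTTWTATA
print(inverse_complement("TATAAA"))    # TTTATA
print(inverse_complement("TATAWAW"))   # WTWTATA
```

YTTWTATA is the compact IUPAC spelling of (C/T)TT(A/T)TATA, and WTWTATA covers the TTATATA, TTTTATA, ATTTATA and ATATATA hits reported for the TATA(A/T)A(A/T) searches.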

TATAJ (4560-2846) UTRs

  1. Negative strand, negative direction: TTTATATA at 2871.
  2. Positive strand, negative direction: TATATAAA at 2874.

TATAJ negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: TATATAAA at 1602.
  2. Positive strand, negative direction: TATAAAAG at 223, TATAAAAG at 184.
  3. Positive strand, negative direction: TTTTTATA at 219.

TATA box (Juven-Gershon 2010) random dataset samplings

  1. TATAJr0: 0.
  2. TATAJr1: 0.
  3. TATAJr2: 0.
  4. TATAJr3: 0.
  5. TATAJr4: 0.
  6. TATAJr5: 0.
  7. TATAJr6: 0.
  8. TATAJr7: 0.
  9. TATAJr8: 1, TATAAAAA at 708.
  10. TATAJr9: 1, TATATAAA at 3150.
  11. TATAJr0ci: 0.
  12. TATAJr1ci: 0.
  13. TATAJr2ci: 0.
  14. TATAJr3ci: 0.
  15. TATAJr4ci: 2, CTTTTATA at 4527, CTTATATA at 4254.
  16. TATAJr5ci: 0.
  17. TATAJr6ci: 1, CTTATATA at 833.
  18. TATAJr7ci: 1, TTTATATA at 4254.
  19. TATAJr8ci: 0.
  20. TATAJr9ci: 0.

TATAJr arbitrary (evens) (4560-2846) UTRs

  1. TATAJr4ci: CTTTTATA at 4527, CTTATATA at 4254.

TATAJr alternate (odds) (4560-2846) UTRs

  1. TATAJr9: TATATAAA at 3150.
  2. TATAJr7ci: TTTATATA at 4254.

TATAJr alternate positive direction (evens) (4445-4265) core promoters

  1. TATAJr4ci: CTTATATA at 4254.

TATAJr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. TATAJr7ci: TTTATATA at 4254.

TATAJr alternate positive direction (evens) (4265-4050) proximal promoters

  1. TATAJr4ci: CTTATATA at 4254.

TATAJr arbitrary negative direction (evens) (2596-1) distal promoters

  1. TATAJr8: TATAAAAA at 708.
  2. TATAJr6ci: CTTATATA at 833.

TATAJr arbitrary positive direction (odds) (4050-1) distal promoters

  1. TATAJr9: TATATAAA at 3150.

TATAJr alternate positive direction (evens) (4050-1) distal promoters

  1. TATAJr8: TATAAAAA at 708.
  2. TATAJr6ci: CTTATATA at 833.

TATA box (Juven-Gershon 2010) analysis and results

"The metazoan TATA box consensus is TATAWAAR [...]."[5]

| Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
|---|---|---|---|---|---|---|
| Reals | UTR | negative | 2 | 2 | 1 | 1 ± 0 (--1,+-1) |
| Randoms | UTR | arbitrary negative | 2 | 10 | 0.2 | 0.2 |
| Randoms | UTR | alternate negative | 2 | 10 | 0.2 | 0.2 |
| Reals | Core | negative | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | positive | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0.05 |
| Randoms | Core | alternate positive | 1 | 10 | 0.1 | 0.05 |
| Reals | Proximal | negative | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Proximal | positive | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary positive | 1 | 10 | 0.1 | 0.1 |
| Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.1 |
| Reals | Distal | negative | 4 | 2 | 2 | 2 ± 1 (--1,+-3) |
| Randoms | Distal | arbitrary negative | 2 | 10 | 0.2 | 0.1 |
| Randoms | Distal | alternate negative | 0 | 10 | 0 | 0.1 |
| Reals | Distal | positive | 0 | 2 | 0 | 0 |
| Randoms | Distal | arbitrary positive | 1 | 10 | 0.1 | 0.15 |
| Randoms | Distal | alternate positive | 2 | 10 | 0.2 | 0.15 |

Comparison:

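The occurrences of real TATAJ UTRs and negative-direction distals are greater than the randoms; no real TATAJs occur in the other regions. This suggests that the real TATAJs in the UTR and negative-direction distal promoter are likely active or activable.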

TATA box (Basehoar 2004) samplings

The Basic programs (starting with SuccessablesTATA.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The searches and their results are:

  1. Negative strand, negative direction, looking for TATA(A/T)A(A/T)(A/G): 2, TATATAAA at 1602, TATATATA at 1600.
  2. Positive strand, negative direction: 3, TATATAAA at 2874, TATAAAAG at 223, TATAAAAG at 184.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 2, TTTATATA at 2871, TATATATA at 1600.
  6. inverse complement, positive strand, negative direction: 1, TTTTTATA at 219.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.
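The consensus TATA(A/T)A(A/T)(A/G) searched for above stands for eight concrete sequences, two of which end in an ATG start codon. The short sketch below enumerates them; the function name is hypothetical.

```python
from itertools import product

def expand_tatawawr():
    """List every concrete sequence matched by TATA(A/T)A(A/T)(A/G)."""
    return [
        f"TATA{w1}A{w2}{r}"
        for w1, w2, r in product("AT", "AT", "AG")
    ]

members = expand_tatawawr()
ends_in_atg = [m for m in members if m.endswith("ATG")]

print(len(members))  # 8
print(ends_in_atg)   # ['TATAAATG', 'TATATATG']
```

The two ATG-ending members are the sequences described as TATA(A/T)ATG in the quotation from reference [11].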

TATA (4560-2846) UTRs

  1. Negative strand, negative direction: TTTATATA at 2871.
  2. Positive strand, negative direction: TATATAAA at 2874.

TATA negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: TATATAAA at 1602, TATATATA at 1600.
  2. Positive strand, negative direction: TATAAAAG at 223, TATAAAAG at 184.
  3. Positive strand, negative direction: TTTTTATA at 219.

TATA box (Basehoar 2004) random dataset samplings

  1. TATAr0: 0.
  2. TATAr1: 0.
  3. TATAr2: 0.
  4. TATAr3: 0.
  5. TATAr4: 0.
  6. TATAr5: 1, TATAAATG at 1565.
  7. TATAr6: 0.
  8. TATAr7: 1, TATAAATA at 3631.
  9. TATAr8: 1, TATAAAAA at 708.
  10. TATAr9: 1, TATATAAA at 3150.
  11. TATAr0ci: 0.
  12. TATAr1ci: 0.
  13. TATAr2ci: 0.
  14. TATAr3ci: 0.
  15. TATAr4ci: 2, CTTTTATA at 4527, CTTATATA at 4254.
  16. TATAr5ci: 0.
  17. TATAr6ci: 1, CTTATATA at 833.
  18. TATAr7ci: 1, TTTATATA at 4254.
  19. TATAr8ci: 0.
  20. TATAr9ci: 0.

TATAr arbitrary (evens) (4560-2846) UTRs

  1. TATAr4ci: CTTTTATA at 4527, CTTATATA at 4254.

TATAr alternate (odds) (4560-2846) UTRs

  1. TATAr7: TATAAATA at 3631.
  2. TATAr9: TATATAAA at 3150.
  3. TATAr7ci: TTTATATA at 4254.

TATAr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. TATAr7ci: TTTATATA at 4254.

TATAr alternate positive direction (evens) (4265-4050) proximal promoters

  1. TATAr4ci: CTTATATA at 4254.

TATAr arbitrary negative direction (evens) (2596-1) distal promoters

  1. TATAr8: TATAAAAA at 708.
  2. TATAr6ci: CTTATATA at 833.

TATAr alternate negative direction (odds) (2596-1) distal promoters

  1. TATAr5: TATAAATG at 1565.

TATAr arbitrary positive direction (odds) (4050-1) distal promoters

  1. TATAr5: TATAAATG at 1565.
  2. TATAr7: TATAAATA at 3631.
  3. TATAr9: TATATAAA at 3150.

TATAr alternate positive direction (evens) (4050-1) distal promoters

  1. TATAr8: TATAAAAA at 708.
  2. TATAr6ci: CTTATATA at 833.

TATA box (Basehoar 2004) analysis and results

"About 24% of human genes have a TATA-like element and their promoters are generally AT-rich; however, only ~10% of these TATA-containing promoters have the canonical TATA box (TATAWAWR)."[3] Several Saccharomyces genomes had the consensus sequence TATA(A/T)A(A/T)(A/G), yet only about 20% of yeast genes even contained the TATA sequence.[11]

| Reals or randoms | Promoters | Direction | Numbers | Strands | Occurrences | Averages (± 0.1) |
|---|---|---|---|---|---|---|
| Reals | UTR | negative | 2 | 2 | 1 | 1 (--1,+-1) |
| Randoms | UTR | arbitrary negative | 2 | 10 | 0.2 | 0.25 |
| Randoms | UTR | alternate negative | 3 | 10 | 0.3 | 0.25 |
| Reals | Core | negative | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Core | positive | 0 | 2 | 0 | 0 |
| Randoms | Core | arbitrary positive | 0 | 10 | 0 | 0 |
| Randoms | Core | alternate positive | 0 | 10 | 0 | 0 |
| Reals | Proximal | negative | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary negative | 0 | 10 | 0 | 0 |
| Randoms | Proximal | alternate negative | 0 | 10 | 0 | 0 |
| Reals | Proximal | positive | 0 | 2 | 0 | 0 |
| Randoms | Proximal | arbitrary positive | 1 | 10 | 0.1 | 0.1 |
| Randoms | Proximal | alternate positive | 1 | 10 | 0.1 | 0.1 |
| Reals | Distal | negative | 5 | 2 | 2.5 | 2.5 ± 0.5 (--2,+-3) |
| Randoms | Distal | arbitrary negative | 2 | 10 | 0.2 | 0.15 |
| Randoms | Distal | alternate negative | 1 | 10 | 0.1 | 0.15 |
| Reals | Distal | positive | 0 | 2 | 0 | 0 |
| Randoms | Distal | arbitrary positive | 3 | 10 | 0.3 | 0.25 |
| Randoms | Distal | alternate positive | 2 | 10 | 0.2 | 0.25 |

Comparison:

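Real TATA boxes occur more often than the random-dataset averages in the UTR (1 vs. 0.25) and in the negative-direction distal promoter (2.5 vs. 0.15); no real TATA box occurs in the positive-direction distal promoter, where the randoms average 0.25. This suggests that the real TATA boxes in the UTR and the negative-direction distal promoter are likely active or activable.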
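
The Occurrences and Averages columns appear to follow simple arithmetic: occurrences are the number of hits divided by the number of strands sampled, and the average is the midpoint of two such values, with half their difference as the ± term. That reading is inferred from the table rather than stated by the source. A minimal Python sketch under that assumption (the function names are hypothetical):

```python
def per_strand(numbers, strands):
    """Occurrences column: total hits divided by the number of strands sampled."""
    return numbers / strands


def mean_and_half_range(first, second):
    """Averages column (assumed): midpoint of two values and half their difference."""
    return (first + second) / 2, abs(first - second) / 2


# Reals, distal, negative direction: 2 hits on one strand and 3 on the other.
real_mean, real_spread = mean_and_half_range(2, 3)

# Randoms, distal, negative direction: 2 and 1 hits, each over 10 strands.
random_mean, random_spread = mean_and_half_range(per_strand(2, 10), per_strand(1, 10))

print(round(real_mean, 2), round(real_spread, 2))
print(round(random_mean, 2), round(random_spread, 2))
```

With the distal negative-direction values, this reproduces the 2.5 ± 0.5 listed for the reals and the 0.15 listed for the randoms.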

M3 motif samplings

The Basic programs testing the consensus sequence (C/G)CGGAAG(C/T) (starting with SuccessablesM3.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs looked for and found:

  1. Negative strand, negative direction: 0.
  2. Positive strand, negative direction: 1, GCGGAAGT at 2731.
  3. Negative strand, positive direction: 0.
  4. Positive strand, positive direction: 7, CCGGAAGC at 1517, GCGGAAGC at 1306, CCGGAAGT at 1265, CCGGAAGC at 1013, CCGGAAGT at 929, CCGGAAGT at 829, CCGGAAGC at 593.
  5. inverse complement, negative strand, negative direction: 1, GCTTCCGT at 1558.
  6. inverse complement, positive strand, negative direction: 0.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.
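
The programs themselves are not reproduced here. As a rough illustration of the kind of search described above, the following Python sketch parses a consensus written in the parenthesized style, builds its inverse complement, and slides both along a sequence. The function names and the toy sequence are hypothetical, positions are reported as 1-based start positions (an assumption that may differ from the convention used in the listings), and the sketch scans a single strand only rather than covering all eight strand and direction cases.

```python
import re

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def parse_consensus(text):
    """Split e.g. '(C/G)CGGAAG(C/T)' into a list of allowed-base sets (N = any base)."""
    positions = []
    for group, single in re.findall(r"\(([ACGT/]+)\)|([ACGTN])", text):
        if group:
            positions.append(set(group.split("/")))
        elif single == "N":
            positions.append(set("ACGT"))
        else:
            positions.append({single})
    return positions


def inverse_complement(positions):
    """Reverse the motif and complement every allowed base."""
    return [{COMPLEMENT[base] for base in allowed} for allowed in reversed(positions)]


def occurrences(positions, sequence):
    """Return (matched text, 1-based start) for each window that fits the motif."""
    width = len(positions)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        if all(base in allowed for base, allowed in zip(window, positions)):
            hits.append((window, i + 1))
    return hits


motif = parse_consensus("(C/G)CGGAAG(C/T)")
toy = "AACCGGAAGTTTGCTTCCGGAA"  # hypothetical sequence, not taken from the gene

print(occurrences(motif, toy))
print(occurrences(inverse_complement(motif), toy))
```

Representing each motif position as a set of allowed bases keeps the inverse complement a two-step operation (reverse, then complement) even for degenerate positions.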

M3 negative direction (2811-2596) proximal promoters

  1. Positive strand, negative direction: GCGGAAGT at 2731.

M3 negative direction (2596-1) distal promoters

  1. Inverse complement, negative strand, negative direction: GCTTCCGT at 1558.

M3 positive direction (4050-1) distal promoters

  1. Positive strand, positive direction: CCGGAAGC at 1517, GCGGAAGC at 1306, CCGGAAGT at 1265, CCGGAAGC at 1013, CCGGAAGT at 929, CCGGAAGT at 829, CCGGAAGC at 593.
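
The region headings in this section give the nucleotide ranges used to sort each hit. A small Python sketch of that bookkeeping follows, assuming the ranges are inclusive and copied from the headings. Assigning the UTR range to the negative direction follows the analysis tables and is an assumption, as are the names `REGIONS` and `regions_for`.

```python
# Nucleotide ranges as written in the section headings (treated as inclusive).
# Placing the UTR under the negative direction is an assumption based on the tables.
REGIONS = {
    "negative": {"UTR": (2846, 4560), "proximal": (2596, 2811), "distal": (1, 2596)},
    "positive": {"proximal": (4050, 4265), "distal": (1, 4050)},
}


def regions_for(position, direction):
    """List every region whose range contains the position for the given direction."""
    return [
        name
        for name, (low, high) in REGIONS[direction].items()
        if low <= position <= high
    ]


print(regions_for(2731, "negative"))  # GCGGAAGT at 2731
print(regions_for(1558, "negative"))  # GCTTCCGT at 1558
print(regions_for(1517, "positive"))  # CCGGAAGC at 1517
```

Position 2596 falls in both the proximal and distal ranges as the headings are written, so the function returns every region that contains a position rather than a single label.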

M3 random dataset samplings

  1. M3r0: 0.
  2. M3r1: 0.
  3. M3r2: 1, GCGGAAGT at 3374.
  4. M3r3: 0.
  5. M3r4: 1, GCGGAAGT at 2555.
  6. M3r5: 0.
  7. M3r6: 2, GCGGAAGC at 2580, CCGGAAGT at 1193.
  8. M3r7: 0.
  9. M3r8: 1, GCGGAAGC at 2757.
  10. M3r9: 0.
  11. M3r0ci: 0.
  12. M3r1ci: 0.
  13. M3r2ci: 1, ACTTCCGG at 3397.
  14. M3r3ci: 1, GCTTCCGC at 1824.
  15. M3r4ci: 0.
  16. M3r5ci: 0.
  17. M3r6ci: 0.
  18. M3r7ci: 0.
  19. M3r8ci: 1, ACTTCCGC at 4254.
  20. M3r9ci: 1, GCTTCCGG at 3340.

M3r arbitrary (evens) (4560-2846) UTRs

  1. M3r2: GCGGAAGT at 3374.
  2. M3r2ci: ACTTCCGG at 3397.
  3. M3r8ci: ACTTCCGC at 4254.

M3r alternate (odds) (4560-2846) UTRs

  1. M3r9ci: GCTTCCGG at 3340.

M3r arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. M3r8: GCGGAAGC at 2757.

M3r alternate positive direction (evens) (4265-4050) proximal promoters

  1. M3r8ci: ACTTCCGC at 4254.

M3r arbitrary negative direction (evens) (2596-1) distal promoters

  1. M3r4: GCGGAAGT at 2555.
  2. M3r6: GCGGAAGC at 2580, CCGGAAGT at 1193.

M3r alternate negative direction (odds) (2596-1) distal promoters

  1. M3r3ci: GCTTCCGC at 1824.

M3r arbitrary positive direction (odds) (4050-1) distal promoters

  1. M3r3ci: GCTTCCGC at 1824.
  2. M3r9ci: GCTTCCGG at 3340.

M3r alternate positive direction (evens) (4050-1) distal promoters

  1. M3r2: GCGGAAGT at 3374.
  2. M3r4: GCGGAAGT at 2555.
  3. M3r6: GCGGAAGC at 2580, CCGGAAGT at 1193.
  4. M3r8: GCGGAAGC at 2757.
  5. M3r2ci: ACTTCCGG at 3397.

M3 analysis and results

M3 (SCGGAAGY) occurs preferentially in human TATA-less core promoters.[3]
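
The single-letter form SCGGAAGY uses the standard IUPAC nucleotide codes (S = C or G, Y = C or T), which is the same motif written as (C/G)CGGAAG(C/T) in the samplings. A short Python sketch of the conversion between the two notations, plus an equivalent regular expression; the function names are hypothetical:

```python
# Standard IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def to_parenthesized(motif):
    """Write an IUPAC motif in the (X/Y) style used in the listings; N is kept as N."""
    parts = []
    for code in motif:
        bases = IUPAC[code]
        if len(bases) == 1 or code == "N":
            parts.append(code)
        else:
            parts.append("(" + "/".join(bases) + ")")
    return "".join(parts)


def to_regex(motif):
    """Write an IUPAC motif as a regular expression with character classes."""
    return "".join(
        IUPAC[code] if len(IUPAC[code]) == 1 else "[" + IUPAC[code] + "]"
        for code in motif
    )


for name in ("SCGGAAGY", "TGCGCANK", "TATAWAWR"):
    print(name, to_parenthesized(name), to_regex(name))
```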

Reals or randoms  Promoters  Direction           Numbers  Strands  Occurrences  Averages (± 0.1)
Reals             UTR        negative            0        2        0            0
Randoms           UTR        arbitrary negative  3        10       0.3          0.2 ± 0.1
Randoms           UTR        alternate negative  1        10       0.1          0.2 ± 0.1
Reals             Core       negative            0        2        0            0
Randoms           Core       arbitrary negative  0        10       0            0
Randoms           Core       alternate negative  0        10       0            0
Reals             Core       positive            0        2        0            0
Randoms           Core       arbitrary positive  0        10       0            0
Randoms           Core       alternate positive  0        10       0            0
Reals             Proximal   negative            1        2        0.5          0.5 ± 0.5 (--0,+-1)
Randoms           Proximal   arbitrary negative  1        10       0.1          0.05
Randoms           Proximal   alternate negative  0        10       0            0.05
Reals             Proximal   positive            0        2        0            0
Randoms           Proximal   arbitrary positive  0        10       0            0.05
Randoms           Proximal   alternate positive  1        10       0.1          0.05
Reals             Distal     negative            1        2        0.5          0.5 ± 0.5 (--1,+-0)
Randoms           Distal     arbitrary negative  3        10       0.3          0.2 ± 0.1
Randoms           Distal     alternate negative  1        10       0.1          0.2 ± 0.1
Reals             Distal     positive            7        2        3.5          3.5 ± 3.5 (-+0,++7)
Randoms           Distal     arbitrary positive  2        10       0.2          0.4 ± 0.2
Randoms           Distal     alternate positive  6        10       0.6          0.4 ± 0.2

Comparison:

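Real M3 occurrences exceed the random averages in the negative-direction proximal promoter (0.5 vs. 0.05) and in both distal promoters (0.5 vs. 0.2 in the negative direction, 3.5 vs. 0.4 in the positive direction). This suggests that the real M3s are likely active or activable.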

M22 samplings

The Basic programs testing the consensus sequence TGCGCAN(G/T) (starting with SuccessablesM22.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs looked for and found:

  1. Negative strand, negative direction: 0.
  2. Positive strand, negative direction: 0.
  3. Negative strand, positive direction: 1, TGCGCAAG at 1525.
  4. Positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 0.
  6. inverse complement, positive strand, negative direction: 0.
  7. inverse complement, negative strand, positive direction: 3, CGTGCGCA at 1523, CCTGCGCA at 1414, CCTGCGCA at 1314.
  8. inverse complement, positive strand, positive direction: 0.

M22 positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: TGCGCAAG at 1525.
  2. Inverse complement, negative strand, positive direction: CGTGCGCA at 1523, CCTGCGCA at 1414, CCTGCGCA at 1314.

M22 random dataset samplings

  1. M22r0: 0.
  2. M22r1: 1, TGCGCAAT at 1589.
  3. M22r2: 0.
  4. M22r3: 0.
  5. M22r4: 0.
  6. M22r5: 0.
  7. M22r6: 0.
  8. M22r7: 0.
  9. M22r8: 0.
  10. M22r9: 0.
  11. M22r0ci: 0.
  12. M22r1ci: 0.
  13. M22r2ci: 0.
  14. M22r3ci: 0.
  15. M22r4ci: 0.
  16. M22r5ci: 0.
  17. M22r6ci: 0.
  18. M22r7ci: 0.
  19. M22r8ci: 0.
  20. M22r9ci: 0.

M22r alternate negative direction (odds) (2596-1) distal promoters

  1. M22r1: TGCGCAAT at 1589.

M22r arbitrary positive direction (odds) (4050-1) distal promoters

  1. M22r1: TGCGCAAT at 1589.

M22 analysis and results

M22 (TGCGCANK), where K = (G/T), occurs preferentially in human TATA-less core promoters.[3]

Reals or randoms  Promoters  Direction           Numbers  Strands  Occurrences  Averages (± 0.1)
Reals             UTR        negative            0        2        0            0
Randoms           UTR        arbitrary negative  0        10       0            0
Randoms           UTR        alternate negative  0        10       0            0
Reals             Core       negative            0        2        0            0
Randoms           Core       arbitrary negative  0        10       0            0
Randoms           Core       alternate negative  0        10       0            0
Reals             Core       positive            0        2        0            0
Randoms           Core       arbitrary positive  0        10       0            0
Randoms           Core       alternate positive  0        10       0            0
Reals             Proximal   negative            0        2        0            0
Randoms           Proximal   arbitrary negative  0        10       0            0
Randoms           Proximal   alternate negative  0        10       0            0
Reals             Proximal   positive            0        2        0            0
Randoms           Proximal   arbitrary positive  0        10       0            0
Randoms           Proximal   alternate positive  0        10       0            0
Reals             Distal     negative            0        2        0            0
Randoms           Distal     arbitrary negative  0        10       0            0.05
Randoms           Distal     alternate negative  1        10       0.1          0.05
Reals             Distal     positive            4        2        2            2 ± 2 (-+4,++0)
Randoms           Distal     arbitrary positive  1        10       0.1          0.05
Randoms           Distal     alternate positive  0        10       0            0.05

Comparison:

Real M22 occurrences exceed the random average only in the positive-direction distal promoter (2 vs. 0.05). This suggests that the real M22s there are likely active or activable.

Comparisons of TATA boxes for UTRs nn (negative strand, negative direction)

Butler (2002)     Carninci (2006)    Watson (2014)      Juven-Gershon (2008)  Basehoar (2004)
(4560-2846)       (4560-2846)        (4560-2846)        (4560-2846)           (4560-2846)
TATAAA            TATAAAA            TATA(A/T)A(A/T)    TATA(A/T)AA(A/G)      TATA(A/T)A(A/T)(A/G)
-                 -                  TATATAT at 2872    -                     -
-                 -                  ciTTATATA at 2871  ciTTTATATA at 2871    ciTTTATATA at 2871
ciTTTATA at 2869  ciTTTTATA at 2869  ciTTTTATA at 2869  -                     -
TATAAA at 2852    TATAAAA at 2853    TATAAAA at 2853    -                     -

Comparisons of TATA boxes for UTRs pn (positive strand, negative direction)

Butler (2002)     Carninci (2006)    Watson (2014)      Juven-Gershon (2008)  Basehoar (2004)
(4560-2846)       (4560-2846)        (4560-2846)        (4560-2846)           (4560-2846)
TATAAA            TATAAAA            TATA(A/T)A(A/T)    TATA(A/T)AA(A/G)      TATA(A/T)A(A/T)(A/G)
TATAAA at 2874    -                  TATATAA at 2873    TATATAAA at 2874      TATATAAA at 2874

Comparisons of TATA boxes for proximals nn (negative strand, negative direction)

Butler (2002)     Carninci (2006)    Watson (2014)      Juven-Gershon (2008)  Basehoar (2004)
(2811-2596)       (2811-2596)        (2811-2596)        (2811-2596)           (2811-2596)
TATAAA            TATAAAA            TATA(A/T)A(A/T)    TATA(A/T)AA(A/G)      TATA(A/T)A(A/T)(A/G)
ciTTTATA at 2638  -                  ciATTTATA at 2638  -                     -

Comparisons of TATA boxes for distals nn (negative strand, negative direction)

Butler (2002)     Carninci (2006)    Watson (2014)      Juven-Gershon (2008)  Basehoar (2004)
(2596-1)          (2596-1)           (2596-1)           (2596-1)              (2596-1)
TATAAA            TATAAAA            TATA(A/T)A(A/T)    TATA(A/T)AA(A/G)      TATA(A/T)A(A/T)(A/G)
ciTTTATA at 1740  ciTTTTATA at 1740  ciTTTTATA at 1740  -                     -
TATAAA at 1602    -                  TATATAA at 1601    TATATAAA at 1602      TATATAAA at 1602
-                 -                  TATATAT at 1599    -                     TATATATA at 1600

Comparisons of TATA boxes for distals pn (positive strand, negative direction)

Butler (2002)     Carninci (2006)    Watson (2014)      Juven-Gershon (2008)  Basehoar (2004)
(2596-1)          (2596-1)           (2596-1)           (2596-1)              (2596-1)
TATAAA            TATAAAA            TATA(A/T)A(A/T)    TATA(A/T)AA(A/G)      TATA(A/T)A(A/T)(A/G)
-                 -                  TATATAT at 1600    -                     -
TATAAA at 221     TATAAAA at 222     TATAAAA at 222     TATAAAAG at 223       TATAAAAG at 223
ciTTTATA at 219   ciTTTTATA at 219   ciTTTTATA at 219   ciTTTTTATA at 219     ciTTTTTATA at 219
TATAAA at 182     TATAAAA at 183     TATAAAA at 183     TATAAAAG at 184       TATAAAAG at 184

Comparisons of TATA boxes for distals np (negative strand, positive direction)

Butler (2002)     Carninci (2006)    Watson (2014)      Juven-Gershon (2008)  Basehoar (2004)
(4050-1)          (4050-1)           (4050-1)           (4050-1)              (4050-1)
TATAAA            TATAAAA            TATA(A/T)A(A/T)    TATA(A/T)AA(A/G)      TATA(A/T)A(A/T)(A/G)
ciTTTATA at 2588  -                  -                  -                     -
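
The five consensus definitions compared above nest inside one another, so a single stretch of DNA can satisfy several of them at once. The Python sketch below runs all five on one toy sequence. In the rows above, a hit that grows by one base is listed one position later (for example TATAAA at 221, TATAAAA at 222, TATAAAAG at 223), which would be consistent with positions marking the last base of each match; that is an inference, not something the source states, and the sketch reports end positions on that assumption. The regular expressions are hand-written translations of the consensus row, only direct (not inverse complement) matches are searched, and the sequence is hypothetical.

```python
import re

# Hand-translated regular expressions for the five consensus definitions.
PATTERNS = {
    "Butler (2002)": "TATAAA",
    "Carninci (2006)": "TATAAAA",
    "Watson (2014)": "TATA[AT]A[AT]",
    "Juven-Gershon (2008)": "TATA[AT]AA[AG]",
    "Basehoar (2004)": "TATA[AT]A[AT][AG]",
}


def end_positions(sequence):
    """Map each definition to (matched text, 1-based position of the last base)."""
    results = {}
    for name, pattern in PATTERNS.items():
        # The lookahead allows overlapping matches to be reported.
        regex = re.compile("(?=(" + pattern + "))")
        results[name] = [
            (m.group(1), m.start() + len(m.group(1)))
            for m in regex.finditer(sequence)
        ]
    return results


toy = "CCTATAAAAGCC"  # hypothetical sequence, not taken from the gene
for name, hits in end_positions(toy).items():
    print(name, hits)
```

On this toy sequence the same TATA stretch ends at 8, 9, 9, 10 and 10 under the five definitions, mirroring the one-base offsets between columns in the tables.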

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

Initial content for this page in some instances came from Wikiversity.

References

  1. R. P. Lifton, M. L. Goldberg, R. W. Karp, and D. S. Hogness (1978). "The organization of the histone genes in Drosophila melanogaster: functional and evolutionary implications". Cold Spring Harbor Symposia on Quantitative Biology. 42: 1047–51. doi:10.1101/SQB.1978.042.01.105. PMID 98262.
  2. Stephen T. Smale and James T. Kadonaga (July 2003). "The RNA Polymerase II Core Promoter" (PDF). Annual Review of Biochemistry. 72 (1): 449–79. doi:10.1146/annurev.biochem.72.121801.161520. PMID 12651739. Retrieved 2012-05-07.
  3. C Yang, E Bolotin, T Jiang, FM Sladek, E Martinez (March 2007). "Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene. 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. PMID 17123746.
  4. Stephen T. Smale (October 1, 2001). "Core promoters: active contributors to combinatorial gene regulation". Genes & Development. 15 (19): 2503–8. doi:10.1101/gad.937701. Retrieved 2012-04-28.
  5. Tamar Juven-Gershon and James T. Kadonaga (15 March 2010). "Regulation of gene expression via the core promoter and the basal transcriptional machinery". Developmental Biology. 339 (2): 225–9. doi:10.1016/j.ydbio.2009.08.009. Retrieved 2016-01-16.
  6. Marcelo A. Nobrega, Ivan Ovcharenko, Veena Afzal, and Edward M. Rubin (October 2003). "Scanning human gene deserts for long-range enhancers". Science. 302 (5644): 413. doi:10.1126/science.1088328. PMID 14563999. Retrieved 2012-12-26.
  7. HGNC (December 20, 2012). "DACH1 dachshund homolog 1 (Drosophila) [ Homo sapiens ]". Bethesda, Maryland, USA: ncbi.nlm.nih. Retrieved 2012-12-26.
  8. Tetsuya Kosaka, Atsuro Miyata, Hayato Ihara, Shuntaro Hara, Tamiko Sugimoto, Osamu Takeda, Ei-ichi Takahashi, Tadashi Tanabe (May 1994). "Characterization of the human gene (PTGS2) encoding prostaglandin‐endoperoxide synthase 2". European Journal of Biochemistry. 221 (3): 889–97. doi:10.1111/j.1432-1033.1994.tb18804.x. Retrieved 2012-12-26.
  9. Thomas W. Burke and James T. Kadonaga (November 15, 1997). "The downstream core promoter element, DPE, is conserved from Drosophila to humans and is recognized by TAFII60 of Drosophila". Genes & Development. 11 (22): 3020–31. doi:10.1101/gad.11.22.3020. PMC 316699. PMID 9367984.
  10. HGNC (February 3, 2013). "HSPA4 heat shock 70kDa protein 4 [ Homo sapiens ]". 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 2013-02-07.
  11. Basehoar, Andrew D.; Zanton, Sara J.; Pugh, B. Franklin (2004-03-05). "Identification and distinct regulation of yeast TATA box-containing genes". Cell. 116 (5): 699–709. ISSN 0092-8674. PMID 15006352.
  12. Jennifer E.F. Butler, James T. Kadonaga (October 15, 2002). "The RNA polymerase II core promoter: a key component in the regulation of gene expression". Genes & Development. 16 (20): 2583–92. doi:10.1101/gad.1026202. PMID 12381658.
  13. Carninci P., Sandelin A., Lenhard B., Katayama S., Shimokawa K., Ponjavic J., Semple C.A., Taylor M.S., Engström P.G., Frith M.C., Forrest A.R., Alkema W.B., Tan S.L., Plessy C., Kodzius R., Ravasi T., Kasukawa T., Fukuda S., Kanamori-Katayama M., Kitazume Y., Kawaji H., Kai C., Nakamura M., Konno H., Nakano K., Mottagui-Tabar S., Arner P., Chesi A., Gustincich S., Persichetti F., Suzuki H., Grimmond S.M., Wells C.A., Orlando V., Wahlestedt C., Liu E.T., Harbers M., Kawai J., Bajic V.B., Hume D.A., Hayashizaki Y. (2006). "Genome-wide analysis of mammalian promoter architecture and evolution". Nat. Genet. 38 (6): 626–35. doi:10.1038/ng1789. PMID 16645617.
  14. Molecular biology of the gene. Watson, James D., 1928- (Seventh ed.). Boston. ISBN 9780321762436. OCLC 824087979.
