Specificity protein gene transcriptions: Difference between revisions

 
(14 intermediate revisions by the same user not shown)
Line 314: Line 314:
! Reals or randoms !! Promoters !! direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)  
! Reals or randoms !! Promoters !! direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)  
|-
|-
| Reals || UTR || negative || 2 || 2 || 1 || 1  
| Reals || UTR || negative || 2 || 2 || 1 || 1 ± 0 (--1,+-1)
|-
|-
| Randoms || UTR || arbitrary negative || 4 || 10 || 0.4 || 0.4  
| Randoms || UTR || arbitrary negative || 4 || 10 || 0.4 || 0.4  
Line 326: Line 326:
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0.05  
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0.05  
|-
|-
| Reals || Core || positive || 2 || 2 || 1 || 1  
| Reals || Core || positive || 2 || 2 || 1 || 1 ± 1 (-+0,++2)
|-
|-
| Randoms || Core || arbitrary positive || 0 || 10 || 0 || 0
| Randoms || Core || arbitrary positive || 0 || 10 || 0 || 0
Line 338: Line 338:
| Randoms || Proximal || alternate negative || 0 || 10 || 0 || 0.05  
| Randoms || Proximal || alternate negative || 0 || 10 || 0 || 0.05  
|-
|-
| Reals || Proximal || positive || 1 || 2 || 0.5 || 0.5  
| Reals || Proximal || positive || 1 || 2 || 0.5 || 0.5 ± 0.5 (-+1,++0)
|-
|-
| Randoms || Proximal || arbitrary positive || 0 || 10 || 0 || 0  
| Randoms || Proximal || arbitrary positive || 0 || 10 || 0 || 0  
Line 499: Line 499:
! Reals or randoms !! Promoters !! direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)  
! Reals or randoms !! Promoters !! direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)  
|-
|-
| Reals || UTR || negative || 1 || 2 || 0.5 || 0.5  
| Reals || UTR || negative || 1 || 2 || 0.5 || 0.5 ± 0.5 (--0,+-1)
|-
|-
| Randoms || UTR || arbitrary negative || 7 || 10 || 0.7 || 0.55  
| Randoms || UTR || arbitrary negative || 7 || 10 || 0.7 || 0.55 ± 0.15
|-
|-
| Randoms || UTR || alternate negative || 4 || 10 || 0.4 || 0.55  
| Randoms || UTR || alternate negative || 4 || 10 || 0.4 || 0.55 ± 0.15
|-
|-
| Reals || Core || negative || 0 || 2 || 0 || 0  
| Reals || Core || negative || 0 || 2 || 0 || 0  
Line 513: Line 513:
| Reals || Core || positive || 0 || 2 || 0 || 0  
| Reals || Core || positive || 0 || 2 || 0 || 0  
|-
|-
| Randoms || Core || arbitrary positive || 0 || 10 || 0 || 0.05
| Randoms || Core || arbitrary positive || 0 || 10 || 0 || 0.05 ± 0.05
|-
|-
| Randoms || Core || alternate positive || 1 || 10 || 0.1 || 0.05  
| Randoms || Core || alternate positive || 1 || 10 || 0.1 || 0.05 ± 0.05  
|-
|-
| Reals || Proximal || negative || 0 || 2 || 0 || 0  
| Reals || Proximal || negative || 0 || 2 || 0 || 0  
|-
|-
| Randoms || Proximal || arbitrary negative || 1 || 10 || 0.1 || 0.15  
| Randoms || Proximal || arbitrary negative || 1 || 10 || 0.1 || 0.15 ± 0.05
|-
|-
| Randoms || Proximal || alternate negative || 2 || 10 || 0.2 || 0.15  
| Randoms || Proximal || alternate negative || 2 || 10 || 0.2 || 0.15 ± 0.05
|-
|-
| Reals || Proximal || positive || 1 || 2 || 0.5 || 0.5  
| Reals || Proximal || positive || 1 || 2 || 0.5 || 0.5 ± 0.5 (-+1,++0)
|-
|-
| Randoms || Proximal || arbitrary positive || 0 || 10 || 0 || 0.05  
| Randoms || Proximal || arbitrary positive || 0 || 10 || 0 || 0.05 ± 0.05  
|-
|-
| Randoms || Proximal || alternate positive || 1 || 10 || 0.1 || 0.05  
| Randoms || Proximal || alternate positive || 1 || 10 || 0.1 || 0.05 ± 0.05  
|-
|-
| Reals || Distal || negative || 3 || 2 || 1.5 || 1.5 ± 0.5 (--2,+-1)  
| Reals || Distal || negative || 3 || 2 || 1.5 || 1.5 ± 0.5 (--2,+-1)  
|-
|-
| Randoms || Distal || arbitrary negative || 4 || 10 || 0.4 || 0.75
| Randoms || Distal || arbitrary negative || 4 || 10 || 0.4 || 0.75 ± 0.35
|-
|-
| Randoms || Distal || alternate negative || 11 || 10 || 1.1 || 0.75  
| Randoms || Distal || alternate negative || 11 || 10 || 1.1 || 0.75 ± 0.35
|-
|-
| Reals || Distal || positive || 6 || 2 || 3 || 3  
| Reals || Distal || positive || 6 || 2 || 3 || 3 ± 1 (-+4,++2)
|-
|-
| Randoms || Distal || arbitrary positive || 17 || 10 || 1.7 || 1.3  
| Randoms || Distal || arbitrary positive || 17 || 10 || 1.7 || 1.3 ± 0.4
|-
|-
| Randoms || Distal || alternate positive || 9 || 10 || 0.9 || 1.3  
| Randoms || Distal || alternate positive || 9 || 10 || 0.9 || 1.3 ± 0.4
|}
|}


Comparison:
Comparison:


The occurrences of real SP1M2 UTRs are within the randoms, the proximals are greater than the randoms, negative distals overlap the high end randoms, positive distals are greater than the randoms. This suggests that the real SP1M2s are likely active or activable.
The occurrences of real SP1M2 UTRs, proximals and positive distals are greater than the randoms, negative distals overlap the high randoms. This suggests that the real SP1M2s are likely active or activable.


==Sp-1 (Sato) samplings==
==Sp-1 (Sato) samplings==
Line 664: Line 664:


==SP1S analysis and results==
==SP1S analysis and results==
{{main|Complex locus A1BG and ZNF497#SP1Ss}}
{{main|Complex locus A1BG and ZNF497#SP-1 (Sato)s}}
Sp-1 (CCGCCCC).<ref name=Sato/>
Sp-1 (CCGCCCC).<ref name=Sato/>


Line 673: Line 673:
| Reals || UTR || negative || 0 || 2 || 0 || 0  
| Reals || UTR || negative || 0 || 2 || 0 || 0  
|-
|-
| Randoms || UTR || arbitrary negative || 0 || 10 || 0 || 0  
| Randoms || UTR || arbitrary negative || 5 || 10 || 0.5 || 0.3
|-
|-
| Randoms || UTR || alternate negative || 0 || 10 || 0 || 0  
| Randoms || UTR || alternate negative || 1 || 10 || 0.1 || 0.3
|-
|-
| Reals || Core || negative || 0 || 2 || 0 || 0  
| Reals || Core || negative || 0 || 2 || 0 || 0  
|-
|-
| Randoms || Core || arbitrary negative || 0 || 10 || 0 || 0
| Randoms || Core || arbitrary negative || 1 || 10 || 0.1 || 0.05
|-
|-
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0  
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0.05
|-
|-
| Reals || Core || positive || 2 || 2 || 1 || 1  
| Reals || Core || positive || 2 || 2 || 1 || 1 ± 1 (-+2,++0)
|-
|-
| Randoms || Core || arbitrary positive || 0 || 10 || 0 || 0
| Randoms || Core || arbitrary positive || 0 || 10 || 0 || 0.1
|-
|-
| Randoms || Core || alternate positive || 0 || 10 || 0 || 0  
| Randoms || Core || alternate positive || 2 || 10 || 0.2 || 0.1
|-
|-
| Reals || Proximal || negative || 0 || 2 || 0 || 0  
| Reals || Proximal || negative || 0 || 2 || 0 || 0  
|-
|-
| Randoms || Proximal || arbitrary negative || 0 || 10 || 0 || 0  
| Randoms || Proximal || arbitrary negative || 0 || 10 || 0 || 0.1
|-
|-
| Randoms || Proximal || alternate negative || 0 || 10 || 0 || 0  
| Randoms || Proximal || alternate negative || 2 || 10 || 0.2 || 0.1
|-
|-
| Reals || Proximal || positive || 1 || 2 || 0.5 || 0.5  
| Reals || Proximal || positive || 1 || 2 || 0.5 || 0.5 ± 0.5 (-+0,++2)
|-
|-
| Randoms || Proximal || arbitrary positive || 0 || 10 || 0 || 0  
| Randoms || Proximal || arbitrary positive || 1 || 10 || 0.1 || 0.1
|-
|-
| Randoms || Proximal || alternate positive || 0 || 10 || 0 || 0  
| Randoms || Proximal || alternate positive || 1 || 10 || 0.1 || 0.1
|-
|-
| Reals || Distal || negative || 0 || 2 || 0 || 0  
| Reals || Distal || negative || 0 || 2 || 0 || 0  
|-
|-
| Randoms || Distal || arbitrary negative || 0 || 10 || 0 || 0
| Randoms || Distal || arbitrary negative || 2 || 10 || 0.2 || 0.3
|-
|-
| Randoms || Distal || alternate negative || 0 || 10 || 0 || 0  
| Randoms || Distal || alternate negative || 4 || 10 || 0.4 || 0.3
|-
|-
| Reals || Distal || positive || 3 || 2 || 1.5 || 1.5 ± 0.5 (-+2,++1)  
| Reals || Distal || positive || 3 || 2 || 1.5 || 1.5 ± 0.5 (-+2,++1)  
|-
|-
| Randoms || Distal || arbitrary positive || 0 || 10 || 0 || 0  
| Randoms || Distal || arbitrary positive || 6 || 10 || 0.6 || 0.5
|-
|-
| Randoms || Distal || alternate positive || 0 || 10 || 0 || 0  
| Randoms || Distal || alternate positive || 4 || 10 || 0.4 || 0.5
|}
|}


Line 719: Line 719:


==Sp1 (Yao) samplings==
==Sp1 (Yao) samplings==
{{main|Model samplings}}
 
Copying a responsive elements consensus sequence GCGGC and putting the sequence in "⌘F" finds 21 between ZNF497 and A1BG or one between ZSCAN22 and A1BG as can be found by the computer programs.
Copying a responsive elements consensus sequence GCGGC and putting the sequence in "⌘F" finds 21 between ZNF497 and A1BG or one between ZSCAN22 and A1BG as can be found by the computer programs.


Line 740: Line 740:
# inverse negative strand, positive direction, looking for CGGCG, 9, CGGCG at 1918, CGGCG at 1583, CGGCG at 1548, CGGCG at 1296, CGGCG at 1212, CGGCG at 1044, CGGCG at 638, CGGCG at 540, CGGCG at 355.
# inverse negative strand, positive direction, looking for CGGCG, 9, CGGCG at 1918, CGGCG at 1583, CGGCG at 1548, CGGCG at 1296, CGGCG at 1212, CGGCG at 1044, CGGCG at 638, CGGCG at 540, CGGCG at 355.


===SP1Y proximal promoters===
===SP1Y negative direction (2811-2596) proximal promoters===
{{main|Proximal promoter gene transcriptions}}
 
Negative strand, negative direction: GCCGC at 2726.
# Negative strand, negative direction: GCCGC at 2726.
# Positive strand, negative direction: GCGGC at 2725.
 
===SP1Y negative direction (2596-1) distal promoters===
 
# Negative strand, negative direction: GCGGC at 1154.
# Positive strand, negative direction: GCGGC at 1753, GCGGC at 957.
 
===SP1Y positive direction (4050-1) distal promoters===
 
# Negative strand, positive direction: GCGGC at 1902, GCGGC at 1794, GCGGC at 1637, GCGGC at 1582, GCGGC at 1438, GCGGC at 1423, GCGGC at 1338, GCGGC at 1323, GCGGC at 1255, GCGGC at 1171, GCGGC at 1148, GCGGC at 1034, GCGGC at 1003, GCGGC at 751, GCGGC at 721, GCGGC at 667, GCGGC at 637, GCGGC at 583, GCGGC at 499, GCGGC at 354, GCGGC at 332.
# Negative strand, positive direction: GCCGC at 3226, GCCGC at 2355, GCCGC at 1756, GCCGC at 1648, GCCGC at 903.
# Positive strand, positive direction: GCGGC at 1758, GCGGC at 1163, GCGGC at 1079.
# Positive strand, positive direction: GCCGC at 1918, GCCGC at 1583, GCCGC at 1548, GCCGC at 1296, GCCGC at 1212, GCCGC at 1044, GCCGC at 638, GCCGC at 540, GCCGC at 355.
 
==SP1Y random dataset samplings==
 
# SP1Yr0: 3, GCGGC at 3547, GCGGC at 3444, GCGGC at 3112.
# SP1Yr1: 3, GCGGC at 3801, GCGGC at 3104, GCGGC at 1913.
# SP1Yr2: 4, GCGGC at 4348, GCGGC at 3749, GCGGC at 3670, GCGGC at 2449.
# SP1Yr3: 4, GCGGC at 3413, GCGGC at 2717, GCGGC at 1764, GCGGC at 1442.
# SP1Yr4: 5, GCGGC at 4109, GCGGC at 3556, GCGGC at 2614, GCGGC at 1582, GCGGC at 218.
# SP1Yr5: 7, GCGGC at 3752, GCGGC at 2932, GCGGC at 2166, GCGGC at 1700, GCGGC at 685, GCGGC at 678, GCGGC at 593.
# SP1Yr6: 5, GCGGC at 4434, GCGGC at 4387, GCGGC at 3426, GCGGC at 1665, GCGGC at 1561.
# SP1Yr7: 2, GCGGC at 3615, GCGGC at 774.
# SP1Yr8: 4, GCGGC at 4280, GCGGC at 3286, GCGGC at 2028, GCGGC at 713.
# SP1Yr9: 8, GCGGC at 3896, GCGGC at 3893, GCGGC at 3628, GCGGC at 3536, GCGGC at 2069, GCGGC at 1727, GCGGC at 1427, GCGGC at 887.
# SP1Yr0ci: 5, GCCGC at 3406, GCCGC at 3241, GCCGC at 2379, GCCGC at 1383, GCCGC at 370.
# SP1Yr1ci: 2, GCCGC at 4114, GCCGC at 1060.
# SP1Yr2ci: 4, GCCGC at 3585, GCCGC at 2597, GCCGC at 1965, GCCGC at 1816.
# SP1Yr3ci: 4, GCCGC at 4137, GCCGC at 2791, GCCGC at 2375, GCCGC at 1451.
# SP1Yr4ci: 7, GCCGC at 2344, GCCGC at 1091, GCCGC at 1088, GCCGC at 1021, GCCGC at 586, GCCGC at 79, GCCGC at 16.
# SP1Yr5ci: 5, GCCGC at 4438, GCCGC at 4352, GCCGC at 3323, GCCGC at 2758, GCCGC at 596.
# SP1Yr6ci: 3, GCCGC at 2249, GCCGC at 1279, GCCGC at 1219.
# SP1Yr7ci: 6, GCCGC at 4386, GCCGC at 4213, GCCGC at 3954, GCCGC at 1910, GCCGC at 1769, GCCGC at 656.
# SP1Yr8ci: 4, GCCGC at 2517, GCCGC at 2472, GCCGC at 760, GCCGC at 134.
# SP1Yr9ci: 6, GCCGC at 4309, GCCGC at 3883, GCCGC at 2665, GCCGC at 2448, GCCGC at 1414, GCCGC at 568.
 
===SP1Yr arbitrary (evens) (4560-2846) UTRs===
 
# SP1Yr0: GCGGC at 3547, GCGGC at 3444, GCGGC at 3112.
# SP1Yr2: GCGGC at 4348, GCGGC at 3749, GCGGC at 3670.
# SP1Yr4: GCGGC at 4109, GCGGC at 3556.
# SP1Yr6: GCGGC at 4434, GCGGC at 4387, GCGGC at 3426.
# SP1Yr8: GCGGC at 4280, GCGGC at 3286.
# SP1Yr0ci: GCCGC at 3406, GCCGC at 3241.
# SP1Yr2ci: GCCGC at 3585.
 
===SP1Yr alternate (odds) (4560-2846) UTRs===
 
# SP1Yr1: GCGGC at 3801, GCGGC at 3104.
# SP1Yr3: GCGGC at 3413.
# SP1Yr5: GCGGC at 3752, GCGGC at 2932.
# SP1Yr7: GCGGC at 3615.
# SP1Yr9: GCGGC at 3896, GCGGC at 3893, GCGGC at 3628, GCGGC at 3536.
# SP1Yr1ci: GCCGC at 4114.
# SP1Yr3ci: GCCGC at 4137.
# SP1Yr5ci: GCCGC at 4438, GCCGC at 4352, GCCGC at 3323.
# SP1Yr7ci: GCCGC at 4386, GCCGC at 4213, GCCGC at 3954.
# SP1Yr9ci: GCCGC at 4309, GCCGC at 3883.
 
===SP1Yr arbitrary positive direction (odds) (4445-4265) core promoters===


Positive strand, negative direction: GCGGC at 2725.
# SP1Yr5ci: GCCGC at 4438, GCCGC at 4352.
# SP1Yr7ci: GCCGC at 4386.
# SP1Yr9ci: GCCGC at 4309.


===SP1Y distal promoters===
===SP1Yr alternate positive direction (evens) (4445-4265) core promoters===
{{main|Distal promoter gene transcriptions}}
Negative strand, negative direction: GCGGC at 1154.


Positive strand, negative direction: GCGGC at 1753, GCGGC at 957.
# SP1Yr2: GCGGC at 4348.
# SP1Yr6: GCGGC at 4434, GCGGC at 4387.
# SP1Yr8: GCGGC at 4280.


Negative strand, positive direction: GCCGC at 3226, GCCGC at 2355, GCGGC at 1902, GCGGC at 1794, GCCGC at 1756, GCCGC at 1648, GCGGC at 1637, GCGGC at 1582, GCGGC at 1438, GCGGC at 1423, GCGGC at 1338, GCGGC at 1323, GCGGC at 1255, GCGGC at 1171, GCGGC at 1148, GCGGC at 1034, GCGGC at 1003, GCCGC at 903, GCGGC at 751, GCGGC at 721, GCGGC at 667, GCGGC at 637, GCGGC at 583, GCGGC at 499, GCGGC at 354, GCGGC at 332.
===SP1Yr arbitrary negative direction (evens) (2811-2596) proximal promoters===
 
# SP1Yr4: GCGGC at 2614.
# SP1Yr2ci: GCCGC at 2597.
 
===SP1Yr alternate negative direction (odds) (2811-2596) proximal promoters===
 
# SP1Yr3: GCGGC at 2717.
# SP1Yr3ci: GCCGC at 2791.
# SP1Yr5ci: GCCGC at 2758.
# SP1Yr9ci: GCCGC at 2665.
 
===SP1Yr arbitrary positive direction (odds) (4265-4050) proximal promoters===
 
# SP1Yr1ci: GCCGC at 4114.
# SP1Yr3ci: GCCGC at 4137.
# SP1Yr7ci: GCCGC at 4213.
 
===SP1Yr alternate positive direction (evens) (4265-4050) proximal promoters===
 
# SP1Yr4: GCGGC at 4109.
 
===SP1Yr arbitrary negative direction (evens) (2596-1) distal promoters===
 
# SP1Yr2: GCGGC at 2449.
# SP1Yr4: GCGGC at 1582, GCGGC at 218.
# SP1Yr6: GCGGC at 1665, GCGGC at 1561.
# SP1Yr8: GCGGC at 2028, GCGGC at 713.
# SP1Yr0ci: GCCGC at 2379, GCCGC at 1383, GCCGC at 370.
# SP1Yr2ci: GCCGC at 1965, GCCGC at 1816.
# SP1Yr6ci: GCCGC at 2249, GCCGC at 1279, GCCGC at 1219.
# SP1Yr8ci: GCCGC at 2517, GCCGC at 2472, GCCGC at 760, GCCGC at 134.
 
===SP1Yr alternate negative direction (odds) (2596-1) distal promoters===
 
# SP1Yr1: GCGGC at 1913.
# SP1Yr3: GCGGC at 1764, GCGGC at 1442.
# SP1Yr5: GCGGC at 2166, GCGGC at 1700, GCGGC at 685, GCGGC at 678, GCGGC at 593.
# SP1Yr7: GCGGC at 774.
# SP1Yr9: GCGGC at 2069, GCGGC at 1727, GCGGC at 1427, GCGGC at 887.
# SP1Yr1ci: GCCGC at 1060.
# SP1Yr3ci: GCCGC at 2375, GCCGC at 1451.
# SP1Yr5ci: GCCGC at 596.
# SP1Yr7ci: GCCGC at 1910, GCCGC at 1769, GCCGC at 656.
# SP1Yr9ci: GCCGC at 2448, GCCGC at 1414, GCCGC at 568.
 
===SP1Yr arbitrary positive direction (odds) (4050-1) distal promoters===
 
# SP1Yr1: GCGGC at 3801, GCGGC at 3104, GCGGC at 1913.
# SP1Yr3: GCGGC at 3413, GCGGC at 2717, GCGGC at 1764, GCGGC at 1442.
# SP1Yr5: GCGGC at 3752, GCGGC at 2932, GCGGC at 2166, GCGGC at 1700, GCGGC at 685, GCGGC at 678, GCGGC at 593.
# SP1Yr7: GCGGC at 3615, GCGGC at 774.
# SP1Yr9: GCGGC at 3896, GCGGC at 3893, GCGGC at 3628, GCGGC at 3536, GCGGC at 2069, GCGGC at 1727, GCGGC at 1427, GCGGC at 887.
# SP1Yr1ci: GCCGC at 1060.
# SP1Yr3ci: GCCGC at 2791, GCCGC at 2375, GCCGC at 1451.
# SP1Yr5ci: GCCGC at 3323, GCCGC at 2758, GCCGC at 596.
# SP1Yr7ci: GCCGC at 3954, GCCGC at 1910, GCCGC at 1769, GCCGC at 656.
# SP1Yr9ci: GCCGC at 3883, GCCGC at 2665, GCCGC at 2448, GCCGC at 1414, GCCGC at 568.
 
===SP1Yr alternate positive direction (evens) (4050-1) distal promoters===
 
# SP1Yr2: GCGGC at 3749, GCGGC at 3670, GCGGC at 2449.
# SP1Yr4: GCGGC at 3556, GCGGC at 2614, GCGGC at 1582, GCGGC at 218.
# SP1Yr6: GCGGC at 3426, GCGGC at 1665, GCGGC at 1561.
# SP1Yr8: GCGGC at 3286, GCGGC at 2028, GCGGC at 713.
# SP1Yr0ci: GCCGC at 3406, GCCGC at 3241, GCCGC at 2379, GCCGC at 1383, GCCGC at 370.
# SP1Yr2ci: GCCGC at 3585, GCCGC at 2597, GCCGC at 1965, GCCGC at 1816.
# SP1Yr6ci: GCCGC at 2249, GCCGC at 1279, GCCGC at 1219.
# SP1Yr8ci: GCCGC at 2517, GCCGC at 2472, GCCGC at 760, GCCGC at 134.
 
==SP1Y analysis and results==
{{main|Complex locus A1BG and ZNF497#SP1 (Yao)s}}
Sp1 (GCGGC).<ref name=Yao2016/>
 
{|class="wikitable"
|-
! Reals or randoms !! Promoters !! direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)
|-
| Reals || UTR || negative || 0 || 2 || 0 || 0
|-
| Randoms || UTR || arbitrary negative || 16 || 10 || 1.6 || 1.8
|-
| Randoms || UTR || alternate negative || 20 || 10 || 2.0 || 1.8
|-
| Reals || Core || negative || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary negative || 0 || 10 || 0 || 0
|-
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0
|-
| Reals || Core || positive || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary positive || 4 || 10 || 0.4 || 0.4
|-
| Randoms || Core || alternate positive || 4 || 10 || 0.4 || 0.4
|-
| Reals || Proximal || negative || 2 || 2 || 1 || 1 ± 0 (--1,+-1)
|-
| Randoms || Proximal || arbitrary negative || 2 || 10 || 0.2 || 0.3
|-
| Randoms || Proximal || alternate negative || 4 || 10 || 0.4 || 0.3
|-
| Reals || Proximal || positive || 0 || 2 || 0 || 0
|-
| Randoms || Proximal || arbitrary positive || 3 || 10 || 0.3 || 0.2
|-
| Randoms || Proximal || alternate positive || 1 || 10 || 0.1 || 0.2
|-
| Reals || Distal || negative || 3 || 2 || 1.5 || 1.5 ± 0.5 (--1,+-2)
|-
| Randoms || Distal || arbitrary negative || 19 || 10 || 1.9 || 2.1
|-
| Randoms || Distal || alternate negative || 23 || 10 || 2.3 || 2.1
|-
| Reals || Distal || positive || 38 || 2 || 19 || 19 ± 7 (-+26,++12)
|-
| Randoms || Distal || arbitrary positive || 40 || 10 || 4.0 || 3.45
|-
| Randoms || Distal || alternate positive || 29 || 10 || 2.9 || 3.45
|}
 
Comparison:


Positive strand, positive direction: GCCGC at 1918, GCGGC at 1758, GCCGC at 1583, GCCGC at 1548, GCCGC at 1296, GCCGC at 1212, GCGGC at 1163, GCGGC at 1079, GCCGC at 1044, GCCGC at 638, GCCGC at 540, GCCGC at 355.
The occurrences of real SP1Y negative proximals and positive distals are greater than the randoms, negative distals overlap low randoms. This suggests that the real SP1Ys are likely active or activable.


==GC box (Zhang)==
==GC box (Zhang) samplings==


Consensus sequence is (G/T)GGGCGG(A/G)(A/G)(C/T).<ref name=Zhang/>
Consensus sequence is (G/T)GGGCGG(A/G)(A/G)(C/T).<ref name=Zhang/>
Line 819: Line 1,003:


==GC box (Zhang) analysis and results==
==GC box (Zhang) analysis and results==
{{main|Complex locus A1BG and ZNF497#Name of response elements}}
{{main|Complex locus A1BG and ZNF497#GC boxes (Zhang)}}


Consensus sequence is (G/T)GGGCGG(A/G)(A/G)(C/T).<ref name=Zhang/>
Consensus sequence is (G/T)GGGCGG(A/G)(A/G)(C/T).<ref name=Zhang/>
Line 827: Line 1,011:
! Reals or randoms !! Promoters !! direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)  
! Reals or randoms !! Promoters !! direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)  
|-
|-
| Reals || UTR || negative || 1 || 2 || 0.5 || 0.5  
| Reals || UTR || negative || 1 || 2 || 0.5 || 0.5 ± 0.5 (--1,+-0)
|-
|-
| Randoms || UTR || arbitrary negative || 0 || 10 || 0 || 0  
| Randoms || UTR || arbitrary negative || 0 || 10 || 0 || 0  
Line 863: Line 1,047:
| Randoms || Distal || alternate negative || 1 || 10 || 0.1 || 0.15  
| Randoms || Distal || alternate negative || 1 || 10 || 0.1 || 0.15  
|-
|-
| Reals || Distal || positive || 1 || 2 || 0.5 || 0.5  
| Reals || Distal || positive || 1 || 2 || 0.5 || 0.5 ± 0.5 (-+1,++0)
|-
|-
| Randoms || Distal || arbitrary positive || 1 || 10 || 0.1 || 0.15  
| Randoms || Distal || arbitrary positive || 1 || 10 || 0.1 || 0.15  

Latest revision as of 16:55, 8 September 2023

Editor-In-Chief: Henry A. Hoff

File:Ashwagandha.jpg
Withania somnifera produces withaferin A, a steroidal lactone known to inhibit the Sp1 transcription factor. Credit: Hari Prasad Nadig from Bangalore, India.

Specificity protein 1 is abbreviated Sp1.

Sp1 has been used as a control protein for comparison when studying increases or decreases in the aryl hydrocarbon receptor and/or the estrogen receptor, since it binds to both and generally remains at a relatively constant level.[1]

Withaferin A, a steroidal lactone from Withania somnifera, is known to inhibit the Sp1 transcription factor.[2]

"Specific protein 1(SP1) is a member of the SP / Kruppel-like factor (KLF) transcription factor family, characterized by the presence of three conserved Cys2-His2-type zinc finger DNA binding domains in its C-terminus [17–19]."[3]

Consensus sequences

"SP1 binds to the GC box comprising a consensus sequence 5′-(G/T) GGGCGG (G/A) (G/A) (C/T)-3′ via the zinc finger motifs to transactivate gene expression, and about 12,000 SP1 binding sites are found in the human genome [17, 18]."[3]

Sp1-box 1 (GGGGCT) and Sp1-box 2 (CTGCCC).[4]

"Sp3 has been shown to repress transcriptional activity of Sp1 [9]."[4]

Sp-1 (CCGCCCC).[5]

Sp1 (GCGGC).[6]

An apparent consensus sequence for Sp1 (GGGGCT), (CTGCCC) or (CCGCCCC) is 3'-(C/G)(C/G/T)G(C/G)C(C/T)-5'. Alternatively, each must be considered separately.
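A consensus written with alternatives in parentheses can be checked mechanically by turning it into a regular expression. The following is a minimal sketch, assuming the consensus is read left to right exactly as written; the function name `consensus_to_regex` is hypothetical and is not one of the Basic programs used for the samplings.

```python
import re

def consensus_to_regex(consensus: str) -> str:
    """Convert e.g. '(C/G)(C/G/T)G(C/G)C(C/T)' to '[CG][CGT]G[CG]C[CT]'."""
    def to_class(match: re.Match) -> str:
        # '(C/G/T)' becomes the character class '[CGT]'
        return "[" + match.group(1).replace("/", "") + "]"
    compact = consensus.replace(" ", "")
    return re.sub(r"\(([ACGT](?:/[ACGT])+)\)", to_class, compact)

pattern = re.compile(consensus_to_regex("(C/G)(C/G/T)G(C/G)C(C/T)"))

# CCGCCCC is seven nucleotides long, so only its first six are tested here.
for site in ("GGGGCT", "CTGCCC", "CCGCCCC"):
    print(site, bool(pattern.match(site)))
```

The same helper can be applied to the GC box consensus (G/T)GGGCGG(A/G)(A/G)(C/T).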

Human genes

GeneID: 6667 is SP1 Sp1 transcription factor. "The protein encoded by this gene is a zinc finger transcription factor that binds to GC-rich motifs of many promoters. The encoded protein is involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. Post-translational modifications such as phosphorylation, acetylation, glycosylation, and proteolytic processing significantly affect the activity of this protein, which can be an activator or a repressor. Three transcript variants encoding different isoforms have been found for this gene."[7]

  1. NP_612482.2 transcription factor Sp1 isoform a.
  2. NP_003100.1 transcription factor Sp1 isoform b.
  3. NP_001238754.1 transcription factor Sp1 isoform c.

GeneID: 6668 is SP2 Sp2 transcription factor. "This gene encodes a member of the Sp subfamily of Sp/XKLF transcription factors. Sp family proteins are sequence-specific DNA-binding proteins characterized by an amino-terminal trans-activation domain and three carboxy-terminal zinc finger motifs. This protein contains the least conserved DNA-binding domain within the Sp subfamily of proteins, and its DNA sequence specificity differs from the other Sp proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate or in some cases repress expression from different promoters."[8]

  1. NP_003101.3 transcription factor Sp2.

Gene ID: 6670 is SP3 Sp3 transcription factor. "This gene belongs to a family of Sp1 related genes that encode transcription factors that regulate transcription by binding to consensus GC- and GT-box regulatory elements in target genes. This protein contains a zinc finger DNA-binding domain and several transactivation domains, and has been reported to function as a bifunctional transcription factor that either stimulates or represses the transcription of numerous genes. Transcript variants encoding different isoforms have been described for this gene, and one has been reported to initiate translation from a non-AUG (AUA) start codon. Additional isoforms, resulting from the use of alternate downstream translation initiation sites, have also been noted. A related pseudogene has been identified on chromosome 13."[9]

  1. NP_003102.1 transcription factor Sp3 isoform 1.
  2. NP_001017371.3 transcription factor Sp3 isoform 2.
  3. NP_001166183.1 transcription factor Sp3 isoform 3.

GeneID: 6671 is SP4 Sp4 transcription factor. "The protein encoded by this gene is a transcription factor that can bind to the GC promoter region of a variety of genes, including those of the photoreceptor signal transduction system. The encoded protein binds to the same sites in promoter CpG islands as does the transcription factor SP1, although its expression is much more restricted compared to that of SP1. This gene may be involved in bipolar disorder and schizophrenia."[10]

  1. NP_003103.2 transcription factor Sp4 isoform 1.
  2. NP_001313471.1 transcription factor Sp4 isoform 2.
  3. NP_001313472.1 transcription factor Sp4 isoform 3.
  4. NR_137166.1 RNA Sequence non-coding (variant 4).

GeneID: 389058 is SP5 Sp5 transcription factor.

  1. NP_001003845.1 transcription factor Sp5.

GeneID: 80320 is SP6 Sp6 transcription factor (aka KLF14). "SP6 belongs to a family of transcription factors that contain 3 classical zinc finger DNA-binding domains consisting of a zinc atom tetrahedrally coordinated by 2 cysteines and 2 histidines (C2H2 motif). These transcription factors bind to GC-rich sequences and related GT and CACCC boxes."[11]

  1. NP_001245177.1 transcription factor Sp6 (variant 1).
  2. NP_954871.1 transcription factor Sp6 (variant 2).

GeneID: 121340 is SP7 Sp7 transcription factor. "This gene encodes a member of the Sp subfamily of Sp/XKLF transcription factors. Sp family proteins are sequence-specific DNA-binding proteins characterized by an amino-terminal trans-activation domain and three carboxy-terminal zinc finger motifs. This protein is a bone specific transcription factor and is required for osteoblast differentiation and bone formation."[12]

  1. NP_001166938.1 transcription factor Sp7 isoform a.
  2. NP_690599.1 transcription factor Sp7 isoform a.
  3. NP_001287766.1 transcription factor Sp7 isoform b.

GeneID: 221833 is SP8 Sp8 transcription factor. "The protein encoded by this gene is an SP family transcription factor that in mouse has been shown to be essential for proper limb development. Two transcript variants encoding different isoforms have been found for this gene."[13]

  1. NP_874359.2 transcription factor Sp8 isoform 1.
  2. NP_945194.1 transcription factor Sp8 isoform 2.

GeneID: 100131390 is SP9 Sp9 transcription factor.

Sp/KLF family

The Sp/KLF family (specificity protein/Krüppel-like factor) is a family of transcription factors,[14] including the Kruppel-like factors as well as Sp1 transcription factor, Sp2 transcription factor, Sp3 transcription factor,[15][16] Sp4 transcription factor,[17] Sp8 transcription factor,[18] Sp9;[19] and possibly Sp5[20] and Sp7 transcription factor.[18] KLF14 is also designated Sp6.[21]

Sp1-box 1 (Motojima) Samplings

Copying the apparent consensus sequence for Sp1 (GGGGCT) and putting it in "⌘F" finds none located between ZSCAN22 and A1BG and four between ZNF497 and A1BG, as can be found by the computer programs.

The Basic programs testing consensus sequence GGGGCT (starting with SuccessablesSP1M.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, what each looks for, and what each found are:

  1. negative strand, negative direction, looking for GGGGCT, 0.
  2. positive strand, negative direction, looking for GGGGCT, 1, GGGGCT at 3039.
  3. negative strand, positive direction, looking for GGGGCT, 4, GGGGCT at 3983, GGGGCT at 1029, GGGGCT at 415, GGGGCT at 262.
  4. positive strand, positive direction, looking for GGGGCT, 1, GGGGCT at 576.
  5. complement, negative strand, negative direction, looking for CCCCGA, 1, CCCCGA at 3039.
  6. complement, positive strand, negative direction, looking for CCCCGA, 0.
  7. complement, positive strand, positive direction, looking for CCCCGA, 4, CCCCGA at 3983, CCCCGA at 1029, CCCCGA at 415, CCCCGA at 262.
  8. complement, negative strand, positive direction, looking for CCCCGA, 1, CCCCGA at 576.
  9. inverse complement, negative strand, negative direction, looking for AGCCCC, 1, AGCCCC at 3037.
  10. inverse complement, positive strand, negative direction, looking for AGCCCC, 0.
  11. inverse complement, positive strand, positive direction, looking for AGCCCC, 6, AGCCCC at 4426, AGCCCC at 4411, AGCCCC at 1955, AGCCCC at 1866, AGCCCC at 349, AGCCCC at 279.
  12. inverse complement, negative strand, positive direction, looking for AGCCCC, 1, AGCCCC at 4218.
  13. inverse negative strand, negative direction, looking for TCGGGG, 0.
  14. inverse positive strand, negative direction, looking for TCGGGG, 1, TCGGGG at 3037.
  15. inverse positive strand, positive direction, looking for TCGGGG, 1, TCGGGG at 4218.
  16. inverse negative strand, positive direction, looking for TCGGGG, 6, TCGGGG at 4426, TCGGGG at 4411, TCGGGG at 1955, TCGGGG at 1866, TCGGGG at 349, TCGGGG at 279.
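The sixteen searches above pair four forms of the consensus (as written, complement, inverse complement, inverse) with each strand and direction. The Basic programs themselves are not reproduced here. The following is a minimal Python sketch of the same idea, in which `motif_variants`, `find_all` and the toy sequence are hypothetical, and reporting a hit by its 1-based start position is an assumption about the numbering convention.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def motif_variants(motif: str) -> dict:
    """Return the four forms of a consensus that the searches look for."""
    comp = motif.translate(COMPLEMENT)
    return {
        "motif": motif,
        "complement": comp,
        "inverse complement": comp[::-1],
        "inverse": motif[::-1],
    }

def find_all(sequence: str, motif: str) -> list:
    """1-based start positions of every match, overlapping ones included."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions

# Toy sequence for illustration only; it is not the A1BG promoter region.
toy = "TTGGGGCTAAAGCCCCTT"
for label, form in motif_variants("GGGGCT").items():
    print(label, form, find_all(toy, form))
```

For GGGGCT this gives CCCCGA as the complement and AGCCCC as the inverse complement, the same strings searched for in the list above.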

SP1M1 (4560-2846) UTRs

  1. Negative strand, negative direction: AGCCCC at 3037.
  2. Positive strand, negative direction: GGGGCT at 3039.

SP1M1 positive direction (4445-4265) core promoters

Positive strand, positive direction: AGCCCC at 4426, AGCCCC at 4411.

SP1M1 positive direction (4265-4050) proximal promoters

  1. Negative strand, positive direction: AGCCCC at 4218.

SP1M1 positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: GGGGCT at 3983, GGGGCT at 1029, GGGGCT at 415, GGGGCT at 262.
  2. Positive strand, positive direction: GGGGCT at 576.
  3. Positive strand, positive direction: AGCCCC at 1955, AGCCCC at 1866, AGCCCC at 349, AGCCCC at 279.
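Sorting each hit into a UTR, core, proximal or distal promoter depends only on its position and the direction. The following is a minimal sketch using the nucleotide ranges given in the section headings; the names `REGIONS` and `classify` are hypothetical, and treating each range as open at its lower bound and closed at its upper bound is an assumption, since the headings do not say how a position exactly on a boundary is assigned.

```python
# (name, lower bound exclusive, upper bound inclusive), from the headings.
REGIONS = {
    "negative": [
        ("UTR", 2846, 4560),
        ("core", 2811, 2846),
        ("proximal", 2596, 2811),
        ("distal", 0, 2596),
    ],
    "positive": [
        ("core", 4265, 4445),
        ("proximal", 4050, 4265),
        ("distal", 0, 4050),
    ],
}

def classify(position: int, direction: str):
    """Return the region name for a hit, or None if it lies outside all ranges."""
    for name, low, high in REGIONS[direction]:
        if low < position <= high:
            return name
    return None

print(classify(3037, "negative"))  # UTR
print(classify(4426, "positive"))  # core
print(classify(4218, "positive"))  # proximal
print(classify(576, "positive"))   # distal
```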

SP1M1 random dataset samplings

  1. SP1M1r0: 1, GGGGCT at 2467.
  2. SP1M1r1: 3, GGGGCT at 1988, GGGGCT at 822, GGGGCT at 66.
  3. SP1M1r2: 0.
  4. SP1M1r3: 4, GGGGCT at 3636, GGGGCT at 3404, GGGGCT at 1923, GGGGCT at 1549.
  5. SP1M1r4: 2, GGGGCT at 1992, GGGGCT at 1469.
  6. SP1M1r5: 1, GGGGCT at 771.
  7. SP1M1r6: 2, GGGGCT at 3276, GGGGCT at 112.
  8. SP1M1r7: 2, GGGGCT at 1378, GGGGCT at 914.
  9. SP1M1r8: 2, GGGGCT at 3320, GGGGCT at 2825.
  10. SP1M1r9: 2, GGGGCT at 2858, GGGGCT at 1717.
  11. SP1M1r0ci: 0.
  12. SP1M1r1ci: 1, AGCCCC at 230.
  13. SP1M1r2ci: 2, AGCCCC at 2564, AGCCCC at 952.
  14. SP1M1r3ci: 2, AGCCCC at 108, AGCCCC at 52.
  15. SP1M1r4ci: 1, AGCCCC at 3228.
  16. SP1M1r5ci: 3, AGCCCC at 3137, AGCCCC at 1634, AGCCCC at 984.
  17. SP1M1r6ci: 3, AGCCCC at 3386, AGCCCC at 2451, AGCCCC at 681.
  18. SP1M1r7ci: 3, AGCCCC at 2175, AGCCCC at 1840, AGCCCC at 270.
  19. SP1M1r8ci: 3, AGCCCC at 4559, AGCCCC at 2720, AGCCCC at 904.
  20. SP1M1r9ci: 1, AGCCCC at 420.

SP1M1r arbitrary (evens) (4560-2846) UTRs

  1. SP1M1r6: GGGGCT at 3276.
  2. SP1M1r8: GGGGCT at 3320.
  3. SP1M1r4ci: AGCCCC at 3228.
  4. SP1M1r8ci: AGCCCC at 4559.

SP1M1r alternate (odds) (4560-2846) UTRs

  1. SP1M1r3: GGGGCT at 3636, GGGGCT at 3404.
  2. SP1M1r9: GGGGCT at 2858.
  3. SP1M1r5ci: AGCCCC at 3137.

SP1M1r arbitrary negative direction (evens) (2846-2811) core promoters

  1. SP1M1r8: GGGGCT at 2825.

SP1M1r arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. SP1M1r8ci: AGCCCC at 2720.

SP1M1r arbitrary negative direction (evens) (2596-1) distal promoters

  1. SP1M1r0: GGGGCT at 2467.
  2. SP1M1r4: GGGGCT at 1992, GGGGCT at 1469.
  3. SP1M1r6: GGGGCT at 112.
  4. SP1M1r2ci: AGCCCC at 2564, AGCCCC at 952.
  5. SP1M1r8ci: AGCCCC at 904.

SP1M1r alternate negative direction (odds) (2596-1) distal promoters

  1. SP1M1r1: GGGGCT at 1988, GGGGCT at 822, GGGGCT at 66.
  2. SP1M1r3: GGGGCT at 1923, GGGGCT at 1549.
  3. SP1M1r5: GGGGCT at 771.
  4. SP1M1r9: GGGGCT at 1717.
  5. SP1M1r1ci: AGCCCC at 230.
  6. SP1M1r3ci: AGCCCC at 108, AGCCCC at 52.
  7. SP1M1r5ci: AGCCCC at 1634, AGCCCC at 984.
  8. SP1M1r7ci: AGCCCC at 2175, AGCCCC at 1840, AGCCCC at 270.
  9. SP1M1r9ci: AGCCCC at 420.

SP1M1r arbitrary positive direction (odds) (4050-1) distal promoters

  1. SP1M1r1: GGGGCT at 1988, GGGGCT at 822, GGGGCT at 66.
  2. SP1M1r3: GGGGCT at 3636, GGGGCT at 3404, GGGGCT at 1923, GGGGCT at 1549.
  3. SP1M1r5: GGGGCT at 771.
  4. SP1M1r9: GGGGCT at 2858, GGGGCT at 1717.
  5. SP1M1r1ci: AGCCCC at 230.
  6. SP1M1r3ci: AGCCCC at 108, AGCCCC at 52.
  7. SP1M1r5ci: AGCCCC at 3137, AGCCCC at 1634, AGCCCC at 984.
  8. SP1M1r7ci: AGCCCC at 2175, AGCCCC at 1840, AGCCCC at 270.
  9. SP1M1r9ci: AGCCCC at 420.

SP1M1r alternate positive direction (evens) (4050-1) distal promoters

  1. SP1M1r0: GGGGCT at 2467.
  2. SP1M1r4: GGGGCT at 1992, GGGGCT at 1469.
  3. SP1M1r6: GGGGCT at 3276, GGGGCT at 112.
  4. SP1M1r8: GGGGCT at 3320, GGGGCT at 2825.
  5. SP1M1r2ci: AGCCCC at 2564, AGCCCC at 952.
  6. SP1M1r8ci: AGCCCC at 2720, AGCCCC at 904.

SP1M1 analysis and results

Sp1-box 1 (GGGGCT).[4]

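{| class="wikitable"
! Reals or randoms !! Promoters !! Direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)
|-
| Reals || UTR || negative || 2 || 2 || 1 || 1 ± 0 (--1,+-1)
|-
| Randoms || UTR || arbitrary negative || 4 || 10 || 0.4 || 0.4
|-
| Randoms || UTR || alternate negative || 4 || 10 || 0.4 || 0.4
|-
| Reals || Core || negative || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary negative || 1 || 10 || 0.1 || 0.05
|-
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0.05
|-
| Reals || Core || positive || 2 || 2 || 1 || 1 ± 1 (-+0,++2)
|-
| Randoms || Core || arbitrary positive || 0 || 10 || 0 || 0
|-
| Randoms || Core || alternate positive || 0 || 10 || 0 || 0
|-
| Reals || Proximal || negative || 0 || 2 || 0 || 0
|-
| Randoms || Proximal || arbitrary negative || 1 || 10 || 0.1 || 0.05
|-
| Randoms || Proximal || alternate negative || 0 || 10 || 0 || 0.05
|-
| Reals || Proximal || positive || 1 || 2 || 0.5 || 0.5 ± 0.5 (-+1,++0)
|-
| Randoms || Proximal || arbitrary positive || 0 || 10 || 0 || 0
|-
| Randoms || Proximal || alternate positive || 0 || 10 || 0 || 0
|-
| Reals || Distal || negative || 0 || 2 || 0 || 0
|-
| Randoms || Distal || arbitrary negative || 7 || 10 || 0.7 || 1.15
|-
| Randoms || Distal || alternate negative || 16 || 10 || 1.6 || 1.15
|-
| Reals || Distal || positive || 9 || 2 || 4.5 || 4.5 ± 0.5 (-+4,++5)
|-
| Randoms || Distal || arbitrary positive || 20 || 10 || 2.0 || 1.55
|-
| Randoms || Distal || alternate positive || 11 || 10 || 1.1 || 1.55
|}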

Comparison:

The occurrences of real SP1M1 UTRs and of positive-direction cores, proximals and distals are greater than the randoms. This suggests that the real SP1M1s are likely active or activable.
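The Averages column in the table above appears to combine two values into a midpoint with a half-range spread; for example, the real distal positive counts (-+4,++5) become 4.5 ± 0.5. A minimal Python sketch of that arithmetic follows. The function names `occurrences` and `mean_and_spread` are hypothetical, and reading "±" as half the range is an inference from the table values rather than a documented method.

```python
def occurrences(number_of_hits, number_of_strands):
    """Occurrences column: hits divided by the number of strands (or datasets)."""
    return number_of_hits / number_of_strands


def mean_and_spread(values):
    """Return (mean, half-range) for a list of numbers.

    Assumption: the "±" in the Averages column is half the distance
    between the smallest and largest value.
    """
    mean = sum(values) / len(values)
    spread = (max(values) - min(values)) / 2
    return mean, spread


# Real distal promoters, positive direction: 4 on the negative strand, 5 on the positive.
real_mean, real_spread = mean_and_spread([4, 5])

# Random distal promoters, positive direction: arbitrary 20/10 and alternate 11/10.
random_mean, random_spread = mean_and_spread(
    [occurrences(20, 10), occurrences(11, 10)]
)

print(f"reals:   {real_mean:.2f} ± {real_spread:.2f}")
print(f"randoms: {random_mean:.2f} ± {random_spread:.2f}")
```

The same function applied to the per-strand counts (26, 12) gives 19 ± 7, which matches the form used for other entries on this page.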

Sp1-box 2 (Motojima) samplings

Copying the response element consensus sequence CTGCCC into a "⌘F" search finds two occurrences between ZNF497 and A1BG and none between ZSCAN22 and A1BG, as can also be found by the computer programs.

The Basic programs testing the consensus sequence CTGCCC (starting with SuccessablesSP1M2.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, what each is looking for, and what each found are:

  1. negative strand, negative direction, looking for CTGCCC, 0.
  2. positive strand, negative direction, looking for CTGCCC, 1, CTGCCC at 3853.
  3. positive strand, positive direction, looking for CTGCCC, 1, CTGCCC at 412.
  4. negative strand, positive direction, looking for CTGCCC, 2, CTGCCC at 4233, CTGCCC at 741.
  5. complement, negative strand, negative direction, looking for GACGGG, 1, GACGGG at 3853.
  6. complement, positive strand, negative direction, looking for GACGGG, 0.
  7. complement, positive strand, positive direction, looking for GACGGG, 2, GACGGG at 4233, GACGGG at 741.
  8. complement, negative strand, positive direction, looking for GACGGG, 1, GACGGG at 412.
  9. inverse complement, negative strand, negative direction, looking for GGGCAG, 2, GGGCAG at 1510, GGGCAG at 753.
  10. inverse complement, positive strand, negative direction, looking for GGGCAG, 1, GGGCAG at 1822.
  11. inverse complement, positive strand, positive direction, looking for GGGCAG, 1, GGGCAG at 3202.
  12. inverse complement, negative strand, positive direction, looking for GGGCAG, 3, GGGCAG at 3472, GGGCAG at 2895, GGGCAG at 2295.
  13. inverse, negative strand, negative direction, looking for CCCGTC, 1, CCCGTC at 1822.
  14. inverse, positive strand, negative direction, looking for CCCGTC, 2, CCCGTC at 1510, CCCGTC at 753.
  15. inverse, positive strand, positive direction, looking for CCCGTC, 3, CCCGTC at 3472, CCCGTC at 2895, CCCGTC at 2295.
  16. inverse, negative strand, positive direction, looking for CCCGTC, 1, CCCGTC at 3202.
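The four search strings in the list above (CTGCCC, its complement GACGGG, its inverse complement GGGCAG and its inverse CCCGTC) can be derived mechanically. The following Python sketch is not the Basic programs themselves; the function names are hypothetical, the toy sequence is invented for illustration, and reporting 1-based start positions is an assumption that may differ from the position convention used on this page.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Swap each base for its Watson-Crick partner, keeping the order."""
    return seq.translate(COMPLEMENT)


def inverse(seq):
    """Reverse the order of the bases."""
    return seq[::-1]


def inverse_complement(seq):
    """Complement the bases and reverse their order."""
    return complement(seq)[::-1]


def find_all(sequence, motif):
    """Return 1-based start positions of every occurrence, overlaps included."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start + 1)
        start = sequence.find(motif, start + 1)
    return hits


consensus = "CTGCCC"
variants = {
    "direct": consensus,
    "complement": complement(consensus),
    "inverse complement": inverse_complement(consensus),
    "inverse": inverse(consensus),
}

toy_sequence = "AACTGCCCTTGGGCAGA"  # invented example, not genomic data
for label, motif in variants.items():
    print(label, motif, find_all(toy_sequence, motif))
```

Searching one strand for the complement is equivalent to searching the opposite strand for the direct sequence, which is why paired entries in the list report the same positions.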

SP1M2 (4560-2846) UTRs

  1. Positive strand, negative direction: CTGCCC at 3853.

SP1M2 positive direction (4265-4050) proximal promoters

  1. Negative strand, positive direction: CTGCCC at 4233.

SP1M2 negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: GGGCAG at 1510, GGGCAG at 753.
  2. Positive strand, negative direction: GGGCAG at 1822.

SP1M2 positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: CTGCCC at 741.
  2. Negative strand, positive direction: GGGCAG at 3472, GGGCAG at 2895, GGGCAG at 2295.
  3. Positive strand, positive direction: CTGCCC at 412.
  4. Positive strand, positive direction: GGGCAG at 3202.
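The groupings above assign each hit to a promoter region by its position and direction. A minimal Python sketch of that binning is given below, using the coordinate ranges quoted in the headings on this page. The names `REGIONS` and `classify` are hypothetical, and treating each range as excluding its lower bound and including its upper bound is an assumption, since the page does not state how boundary positions are handled.

```python
# (region name, lower bound, upper bound) for each direction,
# taken from the ranges quoted in the section headings.
REGIONS = {
    "negative": [
        ("UTR", 2846, 4560),
        ("core", 2811, 2846),
        ("proximal", 2596, 2811),
        ("distal", 0, 2596),
    ],
    "positive": [
        ("core", 4265, 4445),
        ("proximal", 4050, 4265),
        ("distal", 0, 4050),
    ],
}


def classify(position, direction):
    """Return the region containing position, or None if no listed range does.

    Assumption: ranges are treated as lower < position <= upper.
    """
    for name, lower, upper in REGIONS[direction]:
        if lower < position <= upper:
            return name
    return None


# Real SP1M2 hits quoted above.
print(classify(3853, "negative"))  # UTR
print(classify(4233, "positive"))  # proximal
print(classify(1510, "negative"))  # distal
print(classify(741, "positive"))   # distal
```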

SP1M2 random dataset samplings

  1. SP1M2r0: 1, CTGCCC at 2687.
  2. SP1M2r1: 0.
  3. SP1M2r2: 2, CTGCCC at 3347, CTGCCC at 2621.
  4. SP1M2r3: 2, CTGCCC at 135, CTGCCC at 30.
  5. SP1M2r4: 2, CTGCCC at 1518, CTGCCC at 967.
  6. SP1M2r5: 3, CTGCCC at 1970, CTGCCC at 1850, CTGCCC at 121.
  7. SP1M2r6: 2, CTGCCC at 4221, CTGCCC at 1884.
  8. SP1M2r7: 2, CTGCCC at 751, CTGCCC at 616.
  9. SP1M2r8: 0.
  10. SP1M2r9: 3, CTGCCC at 2757, CTGCCC at 731, CTGCCC at 262.
  11. SP1M2r0ci: 0.
  12. SP1M2r1ci: 3, GGGCAG at 3137, GGGCAG at 2651, GGGCAG at 1903.
  13. SP1M2r2ci: 2, GGGCAG at 4548, GGGCAG at 4442.
  14. SP1M2r3ci: 1, GGGCAG at 3330.
  15. SP1M2r4ci: 0.
  16. SP1M2r5ci: 2, GGGCAG at 3905, GGGCAG at 1426.
  17. SP1M2r6ci: 2, GGGCAG at 3797, GGGCAG at 1482.
  18. SP1M2r7ci: 1, GGGCAG at 1997.
  19. SP1M2r8ci: 4, GGGCAG at 3538, GGGCAG at 3533, GGGCAG at 1513, GGGCAG at 1019.
  20. SP1M2r9ci: 1, GGGCAG at 3600.
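For orientation, a random control of this kind can be sketched in Python as follows. This is not the procedure used to generate the SP1M2r datasets; the sequence length of 4560 is taken from the upper coordinate quoted on this page, and equal base probabilities, the fixed seed and all function names are assumptions made for illustration.

```python
import random

BASES = "ACGT"


def random_sequence(length, rng):
    """Build a random nucleotide string with equal base probabilities (an assumption)."""
    return "".join(rng.choice(BASES) for _ in range(length))


def count_hits(sequence, motif):
    """Count occurrences of motif in sequence, allowing overlaps."""
    last_start = len(sequence) - len(motif) + 1
    return sum(1 for i in range(last_start) if sequence.startswith(motif, i))


def expected_hits(length, motif_length):
    """Expected count of one fixed motif in a uniform random sequence."""
    return (length - motif_length + 1) / 4 ** motif_length


rng = random.Random(0)  # fixed seed so the run is repeatable
datasets = [random_sequence(4560, rng) for _ in range(10)]
counts = [count_hits(seq, "CTGCCC") for seq in datasets]

print(counts, sum(counts) / len(counts))
print(round(expected_hits(4560, 6), 2))  # about 1.11 hits per sequence
```

Under these assumptions a fixed six-nucleotide motif is expected roughly once per 4560-nucleotide random sequence, which gives a rough scale for the counts listed above.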

SP1M2r arbitrary (evens) (4560-2846) UTRs

  1. SP1M2r2: CTGCCC at 3347.
  2. SP1M2r6: CTGCCC at 4221.
  3. SP1M2r2ci: GGGCAG at 4548, GGGCAG at 4442.
  4. SP1M2r6ci: GGGCAG at 3797.
  5. SP1M2r8ci: GGGCAG at 3538, GGGCAG at 3533.

SP1M2r alternate (odds) (4560-2846) UTRs

  1. SP1M2r1ci: GGGCAG at 3137.
  2. SP1M2r3ci: GGGCAG at 3330.
  3. SP1M2r5ci: GGGCAG at 3905.
  4. SP1M2r9ci: GGGCAG at 3600.

SP1M2r alternate positive direction (evens) (4445-4265) core promoters

  1. SP1M2r2ci: GGGCAG at 4442.

SP1M2r arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. SP1M2r2: CTGCCC at 2621.

SP1M2r alternate negative direction (odds) (2811-2596) proximal promoters

  1. SP1M2r9: CTGCCC at 2757.
  2. SP1M2r1ci: GGGCAG at 2651.

SP1M2r alternate positive direction (evens) (4265-4050) proximal promoters

  1. SP1M2r6: CTGCCC at 4221.

SP1M2r arbitrary negative direction (evens) (2596-1) distal promoters

  1. SP1M2r6: CTGCCC at 1884.
  2. SP1M2r6ci: GGGCAG at 1482.
  3. SP1M2r8ci: GGGCAG at 1513, GGGCAG at 1019.

SP1M2r alternate negative direction (odds) (2596-1) distal promoters

  1. SP1M2r3: CTGCCC at 135, CTGCCC at 30.
  2. SP1M2r5: CTGCCC at 1970, CTGCCC at 1850, CTGCCC at 121.
  3. SP1M2r7: CTGCCC at 751, CTGCCC at 616.
  4. SP1M2r9: CTGCCC at 731, CTGCCC at 262.
  5. SP1M2r1ci: GGGCAG at 1903.
  6. SP1M2r5ci: GGGCAG at 1426.

SP1M2r arbitrary positive direction (odds) (4050-1) distal promoters

  1. SP1M2r3: CTGCCC at 135, CTGCCC at 30.
  2. SP1M2r5: CTGCCC at 1970, CTGCCC at 1850, CTGCCC at 121.
  3. SP1M2r7: CTGCCC at 751, CTGCCC at 616.
  4. SP1M2r9: CTGCCC at 2757, CTGCCC at 731, CTGCCC at 262.
  5. SP1M2r1ci: GGGCAG at 3137, GGGCAG at 2651, GGGCAG at 1903.
  6. SP1M2r3ci: GGGCAG at 3330.
  7. SP1M2r5ci: GGGCAG at 3905, GGGCAG at 1426.
  8. SP1M2r9ci: GGGCAG at 3600.

SP1M2r alternate positive direction (evens) (4050-1) distal promoters

  1. SP1M2r2: CTGCCC at 3347, CTGCCC at 2621.
  2. SP1M2r6: CTGCCC at 1884.
  3. SP1M2r6ci: GGGCAG at 3797, GGGCAG at 1482.
  4. SP1M2r8ci: GGGCAG at 3538, GGGCAG at 3533, GGGCAG at 1513, GGGCAG at 1019.

SP1M2 analysis and results

Sp1-box 2 (CTGCCC).[4]

{| class="wikitable"
! Reals or randoms !! Promoters !! Direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)
|-
| Reals || UTR || negative || 1 || 2 || 0.5 || 0.5 ± 0.5 (--0,+-1)
|-
| Randoms || UTR || arbitrary negative || 7 || 10 || 0.7 || 0.55 ± 0.15
|-
| Randoms || UTR || alternate negative || 4 || 10 || 0.4 || 0.55 ± 0.15
|-
| Reals || Core || negative || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary negative || 0 || 10 || 0 || 0
|-
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0
|-
| Reals || Core || positive || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary positive || 0 || 10 || 0 || 0.05 ± 0.05
|-
| Randoms || Core || alternate positive || 1 || 10 || 0.1 || 0.05 ± 0.05
|-
| Reals || Proximal || negative || 0 || 2 || 0 || 0
|-
| Randoms || Proximal || arbitrary negative || 1 || 10 || 0.1 || 0.15 ± 0.05
|-
| Randoms || Proximal || alternate negative || 2 || 10 || 0.2 || 0.15 ± 0.05
|-
| Reals || Proximal || positive || 1 || 2 || 0.5 || 0.5 ± 0.5 (-+1,++0)
|-
| Randoms || Proximal || arbitrary positive || 0 || 10 || 0 || 0.05 ± 0.05
|-
| Randoms || Proximal || alternate positive || 1 || 10 || 0.1 || 0.05 ± 0.05
|-
| Reals || Distal || negative || 3 || 2 || 1.5 || 1.5 ± 0.5 (--2,+-1)
|-
| Randoms || Distal || arbitrary negative || 4 || 10 || 0.4 || 0.75 ± 0.35
|-
| Randoms || Distal || alternate negative || 11 || 10 || 1.1 || 0.75 ± 0.35
|-
| Reals || Distal || positive || 6 || 2 || 3 || 3 ± 1 (-+4,++2)
|-
| Randoms || Distal || arbitrary positive || 17 || 10 || 1.7 || 1.3 ± 0.4
|-
| Randoms || Distal || alternate positive || 9 || 10 || 0.9 || 1.3 ± 0.4
|}

Comparison:

The occurrences of real SP1M2 positive proximals and positive distals are greater than the randoms; real UTRs and negative distals overlap the randoms, the negative distals at the high end. This suggests that the real SP1M2s are likely active or activable.
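The distinction between "greater than" and "overlap" for the distal rows can be made explicit by comparing the ranges implied by each average and its spread. The Python sketch below uses the SP1M2 distal values from the table above; the function names are hypothetical and the rule itself (compare mean ± spread intervals) is an assumed reading of how the comparison is made, not a stated method.

```python
def interval(mean, spread):
    """Return the (low, high) range implied by mean ± spread."""
    return (mean - spread, mean + spread)


def compare(real, randoms):
    """Classify the real range against the random range."""
    real_low, real_high = real
    random_low, random_high = randoms
    if real_low > random_high:
        return "greater"
    if real_high < random_low:
        return "lower"
    return "overlap"


# Distal promoters, positive direction: reals 3 ± 1, randoms 1.3 ± 0.4.
print(compare(interval(3, 1), interval(1.3, 0.4)))

# Distal promoters, negative direction: reals 1.5 ± 0.5, randoms 0.75 ± 0.35.
print(compare(interval(1.5, 0.5), interval(0.75, 0.35)))
```

With these inputs the positive distals come out as greater than the randoms, while the negative distals overlap the upper end of the random range.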

Sp-1 (Sato) samplings

Copying the response element consensus sequence CCGCCCC into a "⌘F" search finds none between ZNF497 and A1BG and none between ZSCAN22 and A1BG, as can also be found by the computer programs.

The Basic programs testing the consensus sequence CCGCCCC (starting with SuccessablesSP1S.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, what each is looking for, and what each found are:

  1. negative strand, negative direction, looking for CCGCCCC, 0.
  2. positive strand, negative direction, looking for CCGCCCC, 0.
  3. positive strand, positive direction, looking for CCGCCCC, 1, CCGCCCC at 1027.
  4. negative strand, positive direction, looking for CCGCCCC, 0.
  5. complement, negative strand, negative direction, looking for GGCGGGG, 0.
  6. complement, positive strand, negative direction, looking for GGCGGGG, 0.
  7. complement, positive strand, positive direction, looking for GGCGGGG, 0.
  8. complement, negative strand, positive direction, looking for GGCGGGG, 1, GGCGGGG at 1027.
  9. inverse complement, negative strand, negative direction, looking for GGGGCGG, 0.
  10. inverse complement, positive strand, negative direction, looking for GGGGCGG, 0.
  11. inverse complement, positive strand, positive direction, looking for GGGGCGG, 1, GGGGCGG at 4238.
  12. inverse complement, negative strand, positive direction, looking for GGGGCGG, 4, GGGGCGG at 4439, GGGGCGG at 4429, GGGGCGG at 1793, GGGGCGG at 353.
  13. inverse, negative strand, negative direction, looking for CCCCGCC, 0.
  14. inverse, positive strand, negative direction, looking for CCCCGCC, 0.
  15. inverse, positive strand, positive direction, looking for CCCCGCC, 4, CCCCGCC at 4439, CCCCGCC at 4429, CCCCGCC at 1793, CCCCGCC at 353.
  16. inverse, negative strand, positive direction, looking for CCCCGCC, 1, CCCCGCC at 4238.

SP1S positive direction (4445-4265) core promoters

  1. Negative strand, positive direction: GGGGCGG at 4439, GGGGCGG at 4429.

SP1S positive direction (4265-4050) proximal promoters

  1. Positive strand, positive direction: GGGGCGG at 4238.

SP1S positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: GGGGCGG at 1793, GGGGCGG at 353.
  2. Positive strand, positive direction: CCGCCCC at 1027.

Sp-1 (Sato) random dataset samplings

  1. SP1Sr0: 0.
  2. SP1Sr1: 0.
  3. SP1Sr2: 1, CCGCCCC at 641.
  4. SP1Sr3: 2, CCGCCCC at 2794, CCGCCCC at 119.
  5. SP1Sr4: 0.
  6. SP1Sr5: 1, CCGCCCC at 4101.
  7. SP1Sr6: 1, CCGCCCC at 4339.
  8. SP1Sr7: 0.
  9. SP1Sr8: 1, CCGCCCC at 2475.
  10. SP1Sr9: 2, CCGCCCC at 2668, CCGCCCC at 2451.
  11. SP1Sr0ci: 1, GGGGCGG at 3546.
  12. SP1Sr1ci: 0.
  13. SP1Sr2ci: 1, GGGGCGG at 2818.
  14. SP1Sr3ci: 1, GGGGCGG at 518.
  15. SP1Sr4ci: 2, GGGGCGG at 4496, GGGGCGG at 4108.
  16. SP1Sr5ci: 1, GGGGCGG at 2034.
  17. SP1Sr6ci: 1, GGGGCGG at 4350.
  18. SP1Sr7ci: 0.
  19. SP1Sr8ci: 1, GGGGCGG at 241.
  20. SP1Sr9ci: 0.

SP1Sr arbitrary (evens) (4560-2846) UTRs

  1. SP1Sr6: CCGCCCC at 4339.
  2. SP1Sr0ci: GGGGCGG at 3546.
  3. SP1Sr4ci: GGGGCGG at 4496, GGGGCGG at 4108.
  4. SP1Sr6ci: GGGGCGG at 4350.

SP1Sr alternate (odds) (4560-2846) UTRs

  1. SP1Sr5: CCGCCCC at 4101.

SP1Sr arbitrary negative direction (evens) (2846-2811) core promoters

  1. SP1Sr2ci: GGGGCGG at 2818.

SP1Sr alternate positive direction (evens) (4445-4265) core promoters

  1. SP1Sr6: CCGCCCC at 4339.
  2. SP1Sr6ci: GGGGCGG at 4350.

SP1Sr alternate negative direction (odds) (2811-2596) proximal promoters

  1. SP1Sr3: CCGCCCC at 2794.
  2. SP1Sr9: CCGCCCC at 2668.

SP1Sr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. SP1Sr5: CCGCCCC at 4101.

SP1Sr alternate positive direction (evens) (4265-4050) proximal promoters

  1. SP1Sr4ci: GGGGCGG at 4108.

SP1Sr arbitrary negative direction (evens) (2596-1) distal promoters

  1. SP1Sr2: CCGCCCC at 641.
  2. SP1Sr8ci: GGGGCGG at 241.

SP1Sr alternate negative direction (odds) (2596-1) distal promoters

  1. SP1Sr3: CCGCCCC at 119.
  2. SP1Sr9: CCGCCCC at 2451.
  3. SP1Sr3ci: GGGGCGG at 518.
  4. SP1Sr5ci: GGGGCGG at 2034.

SP1Sr arbitrary positive direction (odds) (4050-1) distal promoters

  1. SP1Sr3: CCGCCCC at 2794, CCGCCCC at 119.
  2. SP1Sr9: CCGCCCC at 2668, CCGCCCC at 2451.
  3. SP1Sr3ci: GGGGCGG at 518.
  4. SP1Sr5ci: GGGGCGG at 2034.

SP1Sr alternate positive direction (evens) (4050-1) distal promoters

  1. SP1Sr2: CCGCCCC at 641.
  2. SP1Sr0ci: GGGGCGG at 3546.
  3. SP1Sr2ci: GGGGCGG at 2818.
  4. SP1Sr8ci: GGGGCGG at 241.

SP1S analysis and results

Sp-1 (CCGCCCC).[5]

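{| class="wikitable"
! Reals or randoms !! Promoters !! Direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)
|-
| Reals || UTR || negative || 0 || 2 || 0 || 0
|-
| Randoms || UTR || arbitrary negative || 5 || 10 || 0.5 || 0.3
|-
| Randoms || UTR || alternate negative || 1 || 10 || 0.1 || 0.3
|-
| Reals || Core || negative || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary negative || 1 || 10 || 0.1 || 0.05
|-
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0.05
|-
| Reals || Core || positive || 2 || 2 || 1 || 1 ± 1 (-+2,++0)
|-
| Randoms || Core || arbitrary positive || 0 || 10 || 0 || 0.1
|-
| Randoms || Core || alternate positive || 2 || 10 || 0.2 || 0.1
|-
| Reals || Proximal || negative || 0 || 2 || 0 || 0
|-
| Randoms || Proximal || arbitrary negative || 0 || 10 || 0 || 0.1
|-
| Randoms || Proximal || alternate negative || 2 || 10 || 0.2 || 0.1
|-
| Reals || Proximal || positive || 1 || 2 || 0.5 || 0.5 ± 0.5 (-+0,++1)
|-
| Randoms || Proximal || arbitrary positive || 1 || 10 || 0.1 || 0.1
|-
| Randoms || Proximal || alternate positive || 1 || 10 || 0.1 || 0.1
|-
| Reals || Distal || negative || 0 || 2 || 0 || 0
|-
| Randoms || Distal || arbitrary negative || 2 || 10 || 0.2 || 0.3
|-
| Randoms || Distal || alternate negative || 4 || 10 || 0.4 || 0.3
|-
| Reals || Distal || positive || 3 || 2 || 1.5 || 1.5 ± 0.5 (-+2,++1)
|-
| Randoms || Distal || arbitrary positive || 6 || 10 || 0.6 || 0.5
|-
| Randoms || Distal || alternate positive || 4 || 10 || 0.4 || 0.5
|}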

Comparison:

The occurrences of real SP1S positive-direction cores, proximals and distals are greater than the randoms. This suggests that the real SP1Ss are likely active or activable.

Sp1 (Yao) samplings

Copying the response element consensus sequence GCGGC into a "⌘F" search finds 21 occurrences between ZNF497 and A1BG and one between ZSCAN22 and A1BG, as can also be found by the computer programs.

The Basic programs testing the consensus sequence GCGGC (starting with SuccessablesSP1Y.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, what each is looking for, and what each found are:

  1. negative strand, negative direction, looking for GCGGC, 1, GCGGC at 1154.
  2. positive strand, negative direction, looking for GCGGC, 3, GCGGC at 2725, GCGGC at 1753, GCGGC at 957.
  3. positive strand, positive direction, looking for GCGGC, 3, GCGGC at 1758, GCGGC at 1163, GCGGC at 1079.
  4. negative strand, positive direction, looking for GCGGC, 21, GCGGC at 1902, GCGGC at 1794, GCGGC at 1637, GCGGC at 1582, GCGGC at 1438, GCGGC at 1423, GCGGC at 1338, GCGGC at 1323, GCGGC at 1255, GCGGC at 1171, GCGGC at 1148, GCGGC at 1034, GCGGC at 1003, GCGGC at 751, GCGGC at 721, GCGGC at 667, GCGGC at 637, GCGGC at 583, GCGGC at 499, GCGGC at 354, GCGGC at 332.
  5. complement, negative strand, negative direction, looking for CGCCG, 3, CGCCG at 2725, CGCCG at 1753, CGCCG at 957.
  6. complement, positive strand, negative direction, looking for CGCCG, 1, CGCCG at 1154.
  7. complement, positive strand, positive direction, looking for CGCCG, 21, CGCCG at 1902, CGCCG at 1794, CGCCG at 1637, CGCCG at 1582, CGCCG at 1438, CGCCG at 1423, CGCCG at 1338, CGCCG at 1323, CGCCG at 1255, CGCCG at 1171, CGCCG at 1148, CGCCG at 1034, CGCCG at 1003, CGCCG at 751, CGCCG at 721, CGCCG at 667, CGCCG at 637, CGCCG at 583, CGCCG at 499, CGCCG at 354, CGCCG at 332.
  8. complement, negative strand, positive direction, looking for CGCCG, 3, CGCCG at 1758, CGCCG at 1163, CGCCG at 1079.
  9. inverse complement, negative strand, negative direction, looking for GCCGC, 1, GCCGC at 2726.
  10. inverse complement, positive strand, negative direction, looking for GCCGC, 0.
  11. inverse complement, positive strand, positive direction, looking for GCCGC, 9, GCCGC at 1918, GCCGC at 1583, GCCGC at 1548, GCCGC at 1296, GCCGC at 1212, GCCGC at 1044, GCCGC at 638, GCCGC at 540, GCCGC at 355.
  12. inverse complement, negative strand, positive direction, looking for GCCGC, 5, GCCGC at 3226, GCCGC at 2355, GCCGC at 1756, GCCGC at 1648, GCCGC at 903.
  13. inverse, negative strand, negative direction, looking for CGGCG, 0.
  14. inverse, positive strand, negative direction, looking for CGGCG, 1, CGGCG at 2726.
  15. inverse, positive strand, positive direction, looking for CGGCG, 5, CGGCG at 3226, CGGCG at 2355, CGGCG at 1756, CGGCG at 1648, CGGCG at 903.
  16. inverse, negative strand, positive direction, looking for CGGCG, 9, CGGCG at 1918, CGGCG at 1583, CGGCG at 1548, CGGCG at 1296, CGGCG at 1212, CGGCG at 1044, CGGCG at 638, CGGCG at 540, CGGCG at 355.

SP1Y negative direction (2811-2596) proximal promoters

  1. Negative strand, negative direction: GCCGC at 2726.
  2. Positive strand, negative direction: GCGGC at 2725.

SP1Y negative direction (2596-1) distal promoters

  1. Negative strand, negative direction: GCGGC at 1154.
  2. Positive strand, negative direction: GCGGC at 1753, GCGGC at 957.

SP1Y positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: GCGGC at 1902, GCGGC at 1794, GCGGC at 1637, GCGGC at 1582, GCGGC at 1438, GCGGC at 1423, GCGGC at 1338, GCGGC at 1323, GCGGC at 1255, GCGGC at 1171, GCGGC at 1148, GCGGC at 1034, GCGGC at 1003, GCGGC at 751, GCGGC at 721, GCGGC at 667, GCGGC at 637, GCGGC at 583, GCGGC at 499, GCGGC at 354, GCGGC at 332.
  2. Negative strand, positive direction: GCCGC at 3226, GCCGC at 2355, GCCGC at 1756, GCCGC at 1648, GCCGC at 903.
  3. Positive strand, positive direction: GCGGC at 1758, GCGGC at 1163, GCGGC at 1079.
  4. Positive strand, positive direction: GCCGC at 1918, GCCGC at 1583, GCCGC at 1548, GCCGC at 1296, GCCGC at 1212, GCCGC at 1044, GCCGC at 638, GCCGC at 540, GCCGC at 355.

SP1Y random dataset samplings

  1. SP1Yr0: 3, GCGGC at 3547, GCGGC at 3444, GCGGC at 3112.
  2. SP1Yr1: 3, GCGGC at 3801, GCGGC at 3104, GCGGC at 1913.
  3. SP1Yr2: 4, GCGGC at 4348, GCGGC at 3749, GCGGC at 3670, GCGGC at 2449.
  4. SP1Yr3: 4, GCGGC at 3413, GCGGC at 2717, GCGGC at 1764, GCGGC at 1442.
  5. SP1Yr4: 5, GCGGC at 4109, GCGGC at 3556, GCGGC at 2614, GCGGC at 1582, GCGGC at 218.
  6. SP1Yr5: 7, GCGGC at 3752, GCGGC at 2932, GCGGC at 2166, GCGGC at 1700, GCGGC at 685, GCGGC at 678, GCGGC at 593.
  7. SP1Yr6: 5, GCGGC at 4434, GCGGC at 4387, GCGGC at 3426, GCGGC at 1665, GCGGC at 1561.
  8. SP1Yr7: 2, GCGGC at 3615, GCGGC at 774.
  9. SP1Yr8: 4, GCGGC at 4280, GCGGC at 3286, GCGGC at 2028, GCGGC at 713.
  10. SP1Yr9: 8, GCGGC at 3896, GCGGC at 3893, GCGGC at 3628, GCGGC at 3536, GCGGC at 2069, GCGGC at 1727, GCGGC at 1427, GCGGC at 887.
  11. SP1Yr0ci: 5, GCCGC at 3406, GCCGC at 3241, GCCGC at 2379, GCCGC at 1383, GCCGC at 370.
  12. SP1Yr1ci: 2, GCCGC at 4114, GCCGC at 1060.
  13. SP1Yr2ci: 4, GCCGC at 3585, GCCGC at 2597, GCCGC at 1965, GCCGC at 1816.
  14. SP1Yr3ci: 4, GCCGC at 4137, GCCGC at 2791, GCCGC at 2375, GCCGC at 1451.
  15. SP1Yr4ci: 7, GCCGC at 2344, GCCGC at 1091, GCCGC at 1088, GCCGC at 1021, GCCGC at 586, GCCGC at 79, GCCGC at 16.
  16. SP1Yr5ci: 5, GCCGC at 4438, GCCGC at 4352, GCCGC at 3323, GCCGC at 2758, GCCGC at 596.
  17. SP1Yr6ci: 3, GCCGC at 2249, GCCGC at 1279, GCCGC at 1219.
  18. SP1Yr7ci: 6, GCCGC at 4386, GCCGC at 4213, GCCGC at 3954, GCCGC at 1910, GCCGC at 1769, GCCGC at 656.
  19. SP1Yr8ci: 4, GCCGC at 2517, GCCGC at 2472, GCCGC at 760, GCCGC at 134.
  20. SP1Yr9ci: 6, GCCGC at 4309, GCCGC at 3883, GCCGC at 2665, GCCGC at 2448, GCCGC at 1414, GCCGC at 568.

SP1Yr arbitrary (evens) (4560-2846) UTRs

  1. SP1Yr0: GCGGC at 3547, GCGGC at 3444, GCGGC at 3112.
  2. SP1Yr2: GCGGC at 4348, GCGGC at 3749, GCGGC at 3670.
  3. SP1Yr4: GCGGC at 4109, GCGGC at 3556.
  4. SP1Yr6: GCGGC at 4434, GCGGC at 4387, GCGGC at 3426.
  5. SP1Yr8: GCGGC at 4280, GCGGC at 3286.
  6. SP1Yr0ci: GCCGC at 3406, GCCGC at 3241.
  7. SP1Yr2ci: GCCGC at 3585.

SP1Yr alternate (odds) (4560-2846) UTRs

  1. SP1Yr1: GCGGC at 3801, GCGGC at 3104.
  2. SP1Yr3: GCGGC at 3413.
  3. SP1Yr5: GCGGC at 3752, GCGGC at 2932.
  4. SP1Yr7: GCGGC at 3615.
  5. SP1Yr9: GCGGC at 3896, GCGGC at 3893, GCGGC at 3628, GCGGC at 3536.
  6. SP1Yr1ci: GCCGC at 4114.
  7. SP1Yr3ci: GCCGC at 4137.
  8. SP1Yr5ci: GCCGC at 4438, GCCGC at 4352, GCCGC at 3323.
  9. SP1Yr7ci: GCCGC at 4386, GCCGC at 4213, GCCGC at 3954.
  10. SP1Yr9ci: GCCGC at 4309, GCCGC at 3883.

SP1Yr arbitrary positive direction (odds) (4445-4265) core promoters

  1. SP1Yr5ci: GCCGC at 4438, GCCGC at 4352.
  2. SP1Yr7ci: GCCGC at 4386.
  3. SP1Yr9ci: GCCGC at 4309.

SP1Yr alternate positive direction (evens) (4445-4265) core promoters

  1. SP1Yr2: GCGGC at 4348.
  2. SP1Yr6: GCGGC at 4434, GCGGC at 4387.
  3. SP1Yr8: GCGGC at 4280.

SP1Yr arbitrary negative direction (evens) (2811-2596) proximal promoters

  1. SP1Yr4: GCGGC at 2614.
  2. SP1Yr2ci: GCCGC at 2597.

SP1Yr alternate negative direction (odds) (2811-2596) proximal promoters

  1. SP1Yr3: GCGGC at 2717.
  2. SP1Yr3ci: GCCGC at 2791.
  3. SP1Yr5ci: GCCGC at 2758.
  4. SP1Yr9ci: GCCGC at 2665.

SP1Yr arbitrary positive direction (odds) (4265-4050) proximal promoters

  1. SP1Yr1ci: GCCGC at 4114.
  2. SP1Yr3ci: GCCGC at 4137.
  3. SP1Yr7ci: GCCGC at 4213.

SP1Yr alternate positive direction (evens) (4265-4050) proximal promoters

  1. SP1Yr4: GCGGC at 4109.

SP1Yr arbitrary negative direction (evens) (2596-1) distal promoters

  1. SP1Yr2: GCGGC at 2449.
  2. SP1Yr4: GCGGC at 1582, GCGGC at 218.
  3. SP1Yr6: GCGGC at 1665, GCGGC at 1561.
  4. SP1Yr8: GCGGC at 2028, GCGGC at 713.
  5. SP1Yr0ci: GCCGC at 2379, GCCGC at 1383, GCCGC at 370.
  6. SP1Yr2ci: GCCGC at 1965, GCCGC at 1816.
  7. SP1Yr6ci: GCCGC at 2249, GCCGC at 1279, GCCGC at 1219.
  8. SP1Yr8ci: GCCGC at 2517, GCCGC at 2472, GCCGC at 760, GCCGC at 134.

SP1Yr alternate negative direction (odds) (2596-1) distal promoters

  1. SP1Yr1: GCGGC at 1913.
  2. SP1Yr3: GCGGC at 1764, GCGGC at 1442.
  3. SP1Yr5: GCGGC at 2166, GCGGC at 1700, GCGGC at 685, GCGGC at 678, GCGGC at 593.
  4. SP1Yr7: GCGGC at 774.
  5. SP1Yr9: GCGGC at 2069, GCGGC at 1727, GCGGC at 1427, GCGGC at 887.
  6. SP1Yr1ci: GCCGC at 1060.
  7. SP1Yr3ci: GCCGC at 2375, GCCGC at 1451.
  8. SP1Yr5ci: GCCGC at 596.
  9. SP1Yr7ci: GCCGC at 1910, GCCGC at 1769, GCCGC at 656.
  10. SP1Yr9ci: GCCGC at 2448, GCCGC at 1414, GCCGC at 568.

SP1Yr arbitrary positive direction (odds) (4050-1) distal promoters

  1. SP1Yr1: GCGGC at 3801, GCGGC at 3104, GCGGC at 1913.
  2. SP1Yr3: GCGGC at 3413, GCGGC at 2717, GCGGC at 1764, GCGGC at 1442.
  3. SP1Yr5: GCGGC at 3752, GCGGC at 2932, GCGGC at 2166, GCGGC at 1700, GCGGC at 685, GCGGC at 678, GCGGC at 593.
  4. SP1Yr7: GCGGC at 3615, GCGGC at 774.
  5. SP1Yr9: GCGGC at 3896, GCGGC at 3893, GCGGC at 3628, GCGGC at 3536, GCGGC at 2069, GCGGC at 1727, GCGGC at 1427, GCGGC at 887.
  6. SP1Yr1ci: GCCGC at 1060.
  7. SP1Yr3ci: GCCGC at 2791, GCCGC at 2375, GCCGC at 1451.
  8. SP1Yr5ci: GCCGC at 3323, GCCGC at 2758, GCCGC at 596.
  9. SP1Yr7ci: GCCGC at 3954, GCCGC at 1910, GCCGC at 1769, GCCGC at 656.
  10. SP1Yr9ci: GCCGC at 3883, GCCGC at 2665, GCCGC at 2448, GCCGC at 1414, GCCGC at 568.

SP1Yr alternate positive direction (evens) (4050-1) distal promoters

  1. SP1Yr2: GCGGC at 3749, GCGGC at 3670, GCGGC at 2449.
  2. SP1Yr4: GCGGC at 3556, GCGGC at 2614, GCGGC at 1582, GCGGC at 218.
  3. SP1Yr6: GCGGC at 3426, GCGGC at 1665, GCGGC at 1561.
  4. SP1Yr8: GCGGC at 3286, GCGGC at 2028, GCGGC at 713.
  5. SP1Yr0ci: GCCGC at 3406, GCCGC at 3241, GCCGC at 2379, GCCGC at 1383, GCCGC at 370.
  6. SP1Yr2ci: GCCGC at 3585, GCCGC at 2597, GCCGC at 1965, GCCGC at 1816.
  7. SP1Yr6ci: GCCGC at 2249, GCCGC at 1279, GCCGC at 1219.
  8. SP1Yr8ci: GCCGC at 2517, GCCGC at 2472, GCCGC at 760, GCCGC at 134.

SP1Y analysis and results

Sp1 (GCGGC).[6]

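{| class="wikitable"
! Reals or randoms !! Promoters !! Direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)
|-
| Reals || UTR || negative || 0 || 2 || 0 || 0
|-
| Randoms || UTR || arbitrary negative || 16 || 10 || 1.6 || 1.8
|-
| Randoms || UTR || alternate negative || 20 || 10 || 2.0 || 1.8
|-
| Reals || Core || negative || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary negative || 0 || 10 || 0 || 0
|-
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0
|-
| Reals || Core || positive || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary positive || 4 || 10 || 0.4 || 0.4
|-
| Randoms || Core || alternate positive || 4 || 10 || 0.4 || 0.4
|-
| Reals || Proximal || negative || 2 || 2 || 1 || 1 ± 0 (--1,+-1)
|-
| Randoms || Proximal || arbitrary negative || 2 || 10 || 0.2 || 0.3
|-
| Randoms || Proximal || alternate negative || 4 || 10 || 0.4 || 0.3
|-
| Reals || Proximal || positive || 0 || 2 || 0 || 0
|-
| Randoms || Proximal || arbitrary positive || 3 || 10 || 0.3 || 0.2
|-
| Randoms || Proximal || alternate positive || 1 || 10 || 0.1 || 0.2
|-
| Reals || Distal || negative || 3 || 2 || 1.5 || 1.5 ± 0.5 (--1,+-2)
|-
| Randoms || Distal || arbitrary negative || 19 || 10 || 1.9 || 2.1
|-
| Randoms || Distal || alternate negative || 23 || 10 || 2.3 || 2.1
|-
| Reals || Distal || positive || 38 || 2 || 19 || 19 ± 7 (-+26,++12)
|-
| Randoms || Distal || arbitrary positive || 40 || 10 || 4.0 || 3.45
|-
| Randoms || Distal || alternate positive || 29 || 10 || 2.9 || 3.45
|}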

Comparison:

The occurrences of real SP1Y negative proximals and positive distals are greater than the randoms; negative distals overlap the low randoms. This suggests that the real SP1Ys are likely active or activable.

GC box (Zhang) samplings

Consensus sequence is (G/T)GGGCGG(A/G)(A/G)(C/T).[3]

  1. Negative strand, negative direction: 0.
  2. Positive strand, negative direction: 0.
  3. Negative strand, positive direction: 1, TGGGCGGGAC at 409.
  4. Positive strand, positive direction: 0.
  5. inverse complement, negative strand, negative direction: 1, ACTCCGCCCA at 3092.
  6. inverse complement, positive strand, negative direction: 0.
  7. inverse complement, negative strand, positive direction: 0.
  8. inverse complement, positive strand, positive direction: 0.
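Because this consensus contains alternative bases at four positions, a plain string search is not enough; one way to test candidates is to convert the consensus into a regular expression. The Python sketch below is illustrative only: the function names are hypothetical, and the parenthesised "(G/T)" notation is assumed to mean "either base at this position".

```python
import re

CONSENSUS = "(G/T)GGGCGG(A/G)(A/G)(C/T)"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_to_regex(consensus):
    """Turn '(G/T)GGGCGG...' into a character-class regex such as '[GT]GGGCGG...'."""
    return re.sub(
        r"\(([ACGT/]+)\)",
        lambda match: "[" + match.group(1).replace("/", "") + "]",
        consensus,
    )


def inverse_complement_consensus(consensus):
    """Reverse the order of the positions and complement every base."""
    tokens = re.findall(r"\([ACGT/]+\)|[ACGT]", consensus)
    return "".join(token.translate(COMPLEMENT) for token in reversed(tokens))


forward = consensus_to_regex(CONSENSUS)
backward = consensus_to_regex(inverse_complement_consensus(CONSENSUS))

print(forward)
print(backward)
print(bool(re.fullmatch(forward, "TGGGCGGGAC")))   # hit listed at 409
print(bool(re.fullmatch(backward, "ACTCCGCCCA")))  # hit listed at 3092
```

Both real hits listed above satisfy the corresponding pattern under this reading of the consensus.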

GC (4560-2846) UTRs

  1. Negative strand, negative direction: ACTCCGCCCA at 3092.

GC positive direction (4050-1) distal promoters

  1. Negative strand, positive direction: TGGGCGGGAC at 409.

GC box (Zhang) random dataset samplings

  1. GCZboxr0: 0.
  2. GCZboxr1: 0.
  3. GCZboxr2: 0.
  4. GCZboxr3: 0.
  5. GCZboxr4: 0.
  6. GCZboxr5: 0.
  7. GCZboxr6: 0.
  8. GCZboxr7: 1, TGGGCGGAGC at 1223.
  9. GCZboxr8: 1, GGGGCGGGGT at 244.
  10. GCZboxr9: 0.
  11. GCZboxr0ci: 0.
  12. GCZboxr1ci: 0.
  13. GCZboxr2ci: 1, GTTCCGCCCC at 641.
  14. GCZboxr3ci: 0.
  15. GCZboxr4ci: 0.
  16. GCZboxr5ci: 0.
  17. GCZboxr6ci: 0.
  18. GCZboxr7ci: 0.
  19. GCZboxr8ci: 0.
  20. GCZboxr9ci: 0.

GCZboxr arbitrary negative direction (evens) (2596-1) distal promoters

  1. GCZboxr8: GGGGCGGGGT at 244.
  2. GCZboxr2ci: GTTCCGCCCC at 641.

GCZboxr alternate negative direction (odds) (2596-1) distal promoters

  1. GCZboxr7: TGGGCGGAGC at 1223.

GCZboxr arbitrary positive direction (odds) (4050-1) distal promoters

  1. GCZboxr7: TGGGCGGAGC at 1223.

GCZboxr alternate positive direction (evens) (4050-1) distal promoters

  1. GCZboxr8: GGGGCGGGGT at 244.
  2. GCZboxr2ci: GTTCCGCCCC at 641.
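The grouping of random hits into arbitrary and alternate sets can be sketched in Python using the three GC box (Zhang) random hits listed above. The names `HITS`, `dataset_number` and `tally` are hypothetical, inclusive range bounds are an assumption, and the parity rule (evens are arbitrary in the negative direction, odds are arbitrary in the positive direction) is read from the headings on this page.

```python
import re

# Random-dataset hits copied from the GC box (Zhang) list above.
HITS = {"GCZboxr7": [1223], "GCZboxr8": [244], "GCZboxr2ci": [641]}
DATASETS_PER_GROUP = 10  # matches the "Strands" value used for random rows


def dataset_number(name):
    """Extract the dataset digit, e.g. 'GCZboxr2ci' -> 2."""
    return int(re.search(r"r(\d)(?:ci)?$", name).group(1))


def tally(hits, low, high, parity):
    """Count hits with low <= position <= high in datasets of one parity (0 even, 1 odd)."""
    return sum(
        1
        for name, positions in hits.items()
        if dataset_number(name) % 2 == parity
        for position in positions
        if low <= position <= high
    )


# Distal promoters: 2596-1 in the negative direction, 4050-1 in the positive.
negative_arbitrary = tally(HITS, 1, 2596, parity=0)  # evens
negative_alternate = tally(HITS, 1, 2596, parity=1)  # odds
positive_arbitrary = tally(HITS, 1, 4050, parity=1)  # odds
positive_alternate = tally(HITS, 1, 4050, parity=0)  # evens

for label, count in [
    ("arbitrary negative", negative_arbitrary),
    ("alternate negative", negative_alternate),
    ("arbitrary positive", positive_arbitrary),
    ("alternate positive", positive_alternate),
]:
    print(label, count, count / DATASETS_PER_GROUP)
```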

GC box (Zhang) analysis and results

Consensus sequence is (G/T)GGGCGG(A/G)(A/G)(C/T).[3]

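{| class="wikitable"
! Reals or randoms !! Promoters !! Direction !! Numbers !! Strands !! Occurrences !! Averages (± 0.1)
|-
| Reals || UTR || negative || 1 || 2 || 0.5 || 0.5 ± 0.5 (--1,+-0)
|-
| Randoms || UTR || arbitrary negative || 0 || 10 || 0 || 0
|-
| Randoms || UTR || alternate negative || 0 || 10 || 0 || 0
|-
| Reals || Core || negative || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary negative || 0 || 10 || 0 || 0
|-
| Randoms || Core || alternate negative || 0 || 10 || 0 || 0
|-
| Reals || Core || positive || 0 || 2 || 0 || 0
|-
| Randoms || Core || arbitrary positive || 0 || 10 || 0 || 0
|-
| Randoms || Core || alternate positive || 0 || 10 || 0 || 0
|-
| Reals || Proximal || negative || 0 || 2 || 0 || 0
|-
| Randoms || Proximal || arbitrary negative || 0 || 10 || 0 || 0
|-
| Randoms || Proximal || alternate negative || 0 || 10 || 0 || 0
|-
| Reals || Proximal || positive || 0 || 2 || 0 || 0
|-
| Randoms || Proximal || arbitrary positive || 0 || 10 || 0 || 0
|-
| Randoms || Proximal || alternate positive || 0 || 10 || 0 || 0
|-
| Reals || Distal || negative || 0 || 2 || 0 || 0
|-
| Randoms || Distal || arbitrary negative || 2 || 10 || 0.2 || 0.15
|-
| Randoms || Distal || alternate negative || 1 || 10 || 0.1 || 0.15
|-
| Reals || Distal || positive || 1 || 2 || 0.5 || 0.5 ± 0.5 (-+1,++0)
|-
| Randoms || Distal || arbitrary positive || 1 || 10 || 0.1 || 0.15
|-
| Randoms || Distal || alternate positive || 2 || 10 || 0.2 || 0.15
|}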

Comparison:

The occurrences of real GC box (Zhang) UTRs and positive distals are greater than the randoms. This suggests that the real GC boxes (Zhang) are likely active or activable.

Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

Initial content for this page in some instances came from Wikiversity.

References

  1. Wormke M, Stoner M, Saville B, Walker K, Abdelrahim M, Burghardt R, Safe S (March 2003). "The aryl hydrocarbon receptor mediates degradation of estrogen receptor alpha through activation of proteasomes". Mol. Cell. Biol. 23 (6): 1843–55. doi:10.1128/MCB.23.6.1843-1855.2003. PMC 149455. PMID 12612060.
  2. Prasanna KS, Shilpa P, Salimath BP (2009). "Withaferin A suppresses the expression of vascular endothelial growth factor in Ehrlich ascites tumor cells via Sp1 transcription" (PDF). Current Trends in Biotechnology and Pharmacy. 3 (2): 138–148.
  3. Bosen Zhang, Liwei Song, Jiali Cai, Lei Li, Hong Xu, Mengying Li, Jiamin Wang, Minmin Shi, Hao Chen, Hao Jia, and Zhaoyuan Hou (17 May 2019). "The LIM protein Ajuba/SP1 complex forms a feed forward loop to induce SP1 target genes and promote pancreatic cancer cell proliferation". Journal of Experimental and Clinical Cancer Research. 38: 205. doi:10.1186/s13046-019-1203-2. PMID 31101117. Retrieved 27 February 2021.
  4. Masaru Motojima, Takao Ando and Toshimasa Yoshioka (10 July 2000). "Sp1-like activity mediates angiotensin-II-induced plasminogen-activator inhibitor type-1 (PAI-1) gene expression in mesangial cells" (PDF). Biochemical Journal. 349 (2): 435–441. doi:10.1042/0264-6021:3490435. PMID 10880342. Retrieved 13 August 2020.
  5. Hiroshi Sato, Megumi Kita, and Motoharu Seiki (5 November 1993). "v-Src Activates the Expression of 92-kDa Type IV Collagenase Gene through the AP-1 Site and the GT Box Homologous to Retinoblastoma Control Elements" (PDF). The Journal of Biological Chemistry. 268 (31): 23460–8. PMID 8226872. Retrieved 13 August 2020.
  6. D. W. Yao, J. Luo, Q. Y. He, J. Li, H. Wang, H. B. Shi, H. F. Xu, M. Wang and J. J. Loor (May 2016). "Characterization of the liver X receptor-dependent regulatory mechanism of goat stearoyl-coenzyme A desaturase 1 gene by linoleic acid". Journal of Dairy Science. 99 (5): 3945–3957. doi:10.3168/jds.2015-10601. PMID 26947306. Retrieved 5 September 2020.
  7. RefSeq (November 2014). SP1 Sp1 transcription factor [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 16 November 2018.
  8. RefSeq (July 2008). SP2 Sp2 transcription factor [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 16 November 2018.
  9. RefSeq (February 2010). SP3 Sp3 transcription factor [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 16 November 2018.
  10. RefSeq (May 2016). SP4 Sp4 transcription factor [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 16 November 2018.
  11. OMIM (March 2008). SP6 Sp6 transcription factor [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 16 November 2018.
  12. RefSeq (July 2010). SP7 Sp7 transcription factor [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 16 November 2018.
  13. RefSeq (June 2011). SP8 Sp8 transcription factor [ Homo sapiens (human) ]. 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information, U.S. National Library of Medicine. Retrieved 16 November 2018.
  14. Fernandez-Zapico ME, Lomberk GA, Tsuji S, Demars CJ, Bardsley MR, Lin YH, Almada L, Han JJ, Mukhopadhyay D, Ordog T, Buttar NS, Urrutia R (December 2010). "A Functional Family-Wide Screening of SP/KLF Proteins Identifies a Subset of Suppressors of KRAS-Mediated Cell Growth". Biochem J. 435 (2): 529–37. doi:10.1042/BJ20100773. PMC 3130109. PMID 21171965.
  15. EntrezGene 6670
  16. Essafi-Benkhadir K; Grosso S; Puissant A; Robert G; Essafi M; Deckert M; Chamorey E; Dassonville O; Milano G; Auberger P; Pagès G (2009). "Dual role of Sp3 transcription factor as an inducer of apoptosis and a marker of tumour aggressiveness". PLoS ONE. 4 (2): e4478. doi:10.1371/journal.pone.0004478. PMC 2636865. PMID 19212434.
  17. EntrezGene 6671
  18. EntrezGene 170574
  19. EntrezGene 381373
  20. EntrezGene 64406
  21. EntrezGene 83395

External links