Complex locus A1BG and ZNF497

Alpha-1-B glycoprotein is a 54.3 kDa protein in humans that is encoded by the A1BG gene.[1] The protein encoded by this gene is a plasma glycoprotein of unknown function. The protein shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins.

A1BG is located on the negative DNA strand of chromosome 19 from 58,858,172 – 58,864,865.[2] Additionally, A1BG is located directly adjacent to the ZSCAN22 gene (58,838,385 - 58,853,712) on the positive DNA strand, as well as the ZNF837 (58,878,990 - 58,892,389, complement) and ZNF497 (58,865,723 - 58,874,214, complement) genes on the negative strand.[2]
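
The spacing between these neighbouring genes follows directly from the coordinates quoted above. The following is a minimal sketch, assuming the coordinates are 1-based positions on the same chromosome 19 assembly and taking a gap simply as the difference between one gene's end and the next gene's start; the names GENES, gap and span are illustrative and do not come from the cited sources.

```python
# Coordinates as quoted in the text (chromosome 19); names are illustrative.
GENES = {
    "ZSCAN22": (58_838_385, 58_853_712),
    "A1BG": (58_858_172, 58_864_865),
    "ZNF497": (58_865_723, 58_874_214),
    "ZNF837": (58_878_990, 58_892_389),
}

def gap(upstream: str, downstream: str) -> int:
    """Difference between the end of one gene and the start of the next."""
    return GENES[downstream][0] - GENES[upstream][1]

def span(gene: str) -> int:
    """Inclusive length of a gene's coordinate range."""
    start, end = GENES[gene]
    return end - start + 1

for a, b in [("ZSCAN22", "A1BG"), ("A1BG", "ZNF497"), ("ZNF497", "ZNF837")]:
    print(f"{a} -> {b}: {gap(a, b)} nts")
print("A1BG span:", span("A1BG"), "nts")
```

Under this convention the ZSCAN22 to A1BG gap comes to 4,460 nts, which agrees with the position "4460 nts from ZSCAN22" used for the A1BG boundary in the ATA box section.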

ZSCAN22

  1. Gene ID: 342945 is ZSCAN22 zinc finger and SCAN domain containing 22.[3] ZSCAN22 is transcribed in the negative direction from LOC100887072.[3]
  2. Gene ID: 102465484 is MIR6806.[4] MIR6806 is transcribed in the negative direction from LOC105372480.[4]

Alpha-1-B glycoprotein

Def. "a substance that induces an immune response, usually foreign"[5] is called an antigen.

Def. any "substance that elicits [an] immune response"[6] is called an immunogen.

An antigen "or immunogen is a molecule that sometimes stimulates an immune system response."[7] But, "the immune system does not consist of only antibodies",[7] instead it "encompasses all substances that can be recognized by the adaptive immune system."[7]

Def. "a protein produced by B-lymphocytes that binds to an [a specific][8] antigen"[9] is called an antibody.

Five different antibody isotypes are known in mammals, which perform different roles, and help direct the appropriate immune response for each different type of foreign object they encounter.[10]

Although the general structure of all antibodies is very similar, a small region, known as the hypervariable region, at the tip of the protein is extremely variable, allowing millions of antibodies with slightly different tip structures to exist, where each of these variants can bind to a different target, known as an antigen.[11]

Def. "any of the glycoproteins in blood serum that respond to invasion by foreign antigens and that protect the host by removing pathogens;"[12] "an antibody"[13] is called an immunoglobulin.

  1. Gene ID: 1 is Alpha-1-B glycoprotein, a 54.3 kDa protein in humans that is encoded by the A1BG gene.[14] A1BG is transcribed in the positive direction from ZNF497.[14] The protein encoded by this gene is a plasma glycoprotein of unknown function. The protein shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins. Patients who have pancreatic ductal adenocarcinoma show an overexpression of A1BG in pancreatic juice.[15]
  2. Gene ID: 503538 is A1BG-AS1 A1BG antisense RNA 1.[16] A1BG-AS1 is transcribed in the negative direction from ZSCAN22.[16]

Immunoglobulin supergene family

"𝛂1B-glycoprotein(𝛂1B) [...] consists of a single polypeptide chain N-linked to four glucosamine oligosaccharides. The polypeptide has five intrachain disulfide bonds and contains 474 amino acid residues. [...] 𝛂1B exhibits internal duplication and consists of five repeating structural domains, each containing about 95 amino acids and one disulfide bond. [...] several domains of 𝛂1B, especially the third, show statistically significant homology to variable regions of certain immunoglobulin light and heavy chains. 𝛂1B [...] exhibits sequence similarity to other members of the immunoglobulin supergene family such as the receptor for transepithelial transport of IgA and IgM and the secretory component of human IgA."[17]

"Some of the domains of 𝛂1B show significant homology to variable (V) and constant (C) regions of certain immunoglobulins. Likewise, there is statistically significant homology between 𝛂1B and the secretory component (SC) of human IgA (15) and also with the extracellular portion of the rabbit receptor for transepithelial transport of polymeric immunoglobulins (IgA and IgM). Mostov et al. (16) have called the later protein the poly-Ig receptor or poly-IgR and have shown that it is the precursor of SC."[17]

The immunoglobulin supergene family is "the group of proteins that have immunoglobulin-like domains, including histocompatibility antigens, the T-cell antigen receptor, poly-IgR, and other proteins involved in the vertebrate immune response (17)."[17]

"The internal homology in primary structure [...] and the presence of an intrasegment disulfide bond suggest that 𝛂1B is composed of five structural domains that arose by duplication of a primordial gene coding for about 95 amino acid residues."[17]

"Unlike immunoglobulins (25), ceruloplasmin (6), and hemopexin (7), 𝛂1B is not subject to limited interdomain cleavage by proteolytic enzymes. At least, we were not able to produce such fragments by use of a variety of proteases. This stability of 𝛂1B is probably associated with the frequency of proline in the sequences linking the domains [...]."[17]

"A peptide identified in the late and early milk proteomes showed homology to eutherian alpha 1B glycoprotein (A1BG), a plasma protein with unknown function46, as well as venom inhibitors characterised in the Southern opossum Didelphis marsupialis (DM43 and DM4647,48,49), all members of the immunoglobulin superfamily. To characterise the relationship between the peptide sequence identified in koala, A1BG, DM43 and DM46, a phylogenetic tree was constructed [...] including all marsupial and monotreme homologs (identified by BLAST), three phylogenetically representative eutherian sequences, with human IGSF1 and TARM1, related members of the immunoglobulin super family, used as outgroups. This phylogeny indicates that A1BG-like proteins in marsupials and the Didelphis antitoxic proteins are homologs of eutherian A1BG, with excellent bootstrap support (98%). The marsupial A1BG-like sequences and the Didelphis antitoxic proteins formed a single clade with strong bootstrap support (97%)."[18]

"Human TARM1 and IGSF1, related members of the immunoglobulin superfamily are used as outgroups. The tree was constructed using the maximum likelihood approach and the JTT model with bootstrap support values from 500 bootstrap tests. Bootstrap values less than 50% are not displayed. Accession numbers: Tasmanian devil (Sarcophilus harrisii; XP_012402143), Wallaby (Macropus eugenii; FY619507), Possum (Trichosurus vulpecula; DY596639) Virginia opossum (Didelphis virginiana; AAA30970, AAN06914), Southern opossum (Didelphis marsupialis; AAL82794, P82957, AAN64698), Human (Homo sapiens; P04217, B6A8C7, Q8N6C5), Platypus (Ornithorhychus anatinus; ENSOANP00000000762), Cow (Bos taurus; Q2KJF1), Alpaca (Vicugna pacos; XP_015107031)."[18]

A1BG protein species

Def. a "group of plants or animals having similar appearance"[19] or "the largest group of organisms in which [any][20] two individuals [of the appropriate sexes or mating types][20] can produce fertile offspring, typically by sexual reproduction"[21] is called a species.

The gene contains 20 distinct introns.[22] Transcription produces 15 different mRNAs: 10 alternatively spliced variants and 5 unspliced forms.[22] There are 4 probable alternative promoters, 4 non-overlapping alternative last exons and 7 validated alternative polyadenylation sites.[22] The mRNAs appear to differ by truncation of the 5' end, truncation of the 3' end, presence or absence of 4 cassette exons, overlapping exons with different boundaries, and splicing versus retention of 3 introns.[22]

Variants or isoforms

Def. a "different sequence of a gene (locus)"[23] is called a variant.

Def. any "of several different forms of the same protein, arising from either single nucleotide polymorphisms,[24] differential splicing of mRNA, or post-translational modifications (e.g. sulfation, glycosylation, etc.)"[25] is called an isoform.

Regarding additional isoforms, mention has been made of "new genetic variants of A1BG."[26]

"Proteomic analysis revealed that [a circulating] set of plasma proteins was α 1 B-glycoprotein (A1BG) and its post-translationally modified isoforms."[27]

Pharmacogenomic variants have been reported.[28]

Genotypes

Def. the "part (DNA sequence) of the genetic makeup of an organism which determines a specific characteristic (phenotype) of that organism"[29] or a "group of organisms having the same genetic constitution"[30] is called a genotype.

There are A1BG genotypes.[28]

The A1BG single nucleotide polymorphism rs893184 has been included in a genetic risk score.[28]

"A genetic risk score, including rs16982743, rs893184, and rs4525 in F5, was significantly associated with treatment-related adverse cardiovascular outcomes in whites and Hispanics from the INVEST study and in the Nordic Diltiazem study (meta-analysis interaction P=2.39×10−5)."[28]

Polymorphs

Def. the "regular existence of two or more different genotypes within a given species or population; also, variability of amino acid sequences within a gene's protein"[31] is called polymorphism.

Def. "one of a number of alternative forms of the same gene occupying a given position, [or locus],[32] on a chromosome"[33] is called an allele.

"rs893184 causes a histidine (His) to arginine (Arg) [nonsynonymous single nucleotide polymorphism (nsSNP), A (minor) for G (major)] substitution at amino acid position 52 in A1BG."[28]

"Genetic polymorphism of human plasma (serum) alpha 1B-glycoprotein (alpha 1B) was observed using one-dimensional horizontal polyacrylamide gel electrophoresis (PAGE) pH 9.0 of plasma samples followed by Western blotting with specific antiserum to alpha 1B."[34]

"Genetic polymorphism of human plasma 𝜶1B-glycoprotein (𝜶1B) was reported first, in brief, by Altland et al. [1983; also given in Altkand and Hacklar, 1984]. A detailed description of human 𝜶1B polymorphism was reported in subsequent studies [Gahne et al., 1987; Juneja et al., 1988, 1989]. Five different 𝜶1B alleles (A1B*1, A1B*2, A1B*3, A1B*4 and A1B*5) were reported. In Caucasian whites, the frequencies of A1B*1 and A1B*2 were about 0.95 and 0.05, respectively. A1B*4 was observed in 2 related Czech individuals. In American blacks, A1B*1 and A1B*2 occurred with a frequency of 0.73 and 0.21, respectively, while a new allele, viz, A1B*3 had a frequency of 0.06. A1B*5 was observed only in Swedish Lapps and in Finns with a frequency of 0.04 and 0.007, respectively."[35]

"The frequency of A1B*1 varied from 0.89 to 0.91 and that of A1B*2 from 0.08 to 0.10. The A1B*3 allele, reported previously only in American blacks, was observed with a frequency range of 0.003-0.01 in 3 of the Chinese populations, in Koreans and in Malays. A new 𝜶1B allele (A1B*6) was observed in 2 Chinese individuals."[35]

Phenotypes

Def. the "appearance of an organism based on a single trait [multifactorial combination of genetic traits and environmental factors][36], especially used in pedigrees"[37] or any "observable characteristic of an organism, such as its morphological, developmental, biochemical or physiological properties, or its behavior"[38] is called a phenotype.

"The three different phenotypes of α1B observed (designated 1-1, 1-2, and 2-2) were apparently identical to those reported by Altland et al. (1983), who used double one-dimensional electrophoresis. Family data supported the hypothesis that the three α1B phenotypes are determined by two codominant alleles at an autosomal locus, designated A1B. Allele frequencies in a Swedish population were: A1B *1, 0.937; A1B *2, 0.063; PIC, 0.111."[34]

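The PIC (polymorphism information content) quoted for the Swedish population can be reproduced from the two allele frequencies. The sketch below assumes the standard definition, PIC = 1 - sum(p_i^2) - sum over pairs of 2 p_i^2 p_j^2; the quoted passage does not say which formula its authors used, and the function name pic is illustrative.

```python
from itertools import combinations

def pic(freqs):
    """Polymorphism information content for a list of allele frequencies.

    Assumes the standard formula: 1 - sum(p^2) - sum over pairs 2 p^2 q^2.
    """
    homozygosity = sum(p * p for p in freqs)
    correction = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1 - homozygosity - correction

swedish = [0.937, 0.063]  # A1B*1 and A1B*2, as quoted above
print(round(pic(swedish), 3))
```

With the quoted frequencies of 0.937 and 0.063 this evaluates to about 0.111, in agreement with the reported PIC.
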
Protein species

"Both protein species of [alpha 1-beta glycoprotein] A1B (A1Ba, p = 0.008; f.c.= +1.62, A1Bb, p = 0.003; f.c. = +1.82) [...] were apparently overexpressed in patients with PTCa [...]."[39]

A1BG is mainly produced in the liver, and is secreted to plasma to levels of approximately 0.22 mg/mL.[17]

CRISPs

The human cysteine-rich secretory protein (CRISP3) "is present in exocrine secretions and in secretory granules of neutrophilic granulocytes and is believed to play a role in innate immunity."[40] CRISP3 has a relatively high content in human plasma.[40]

"The A1BG-CRISP-3 complex is noncovalent with a 1:1 stoichiometry and is held together by strong electrostatic forces."[40] "Similar [complex formation] between toxins from snake venom and A1BG-like plasma proteins ... inhibits the toxic effect of snake venom metalloproteinases or myotoxins and protects the animal from envenomation."[40]

Opossums have a remarkably robust immune system and show partial or total immunity to the venom of rattlesnakes, cottonmouths (Agkistrodon piscivorus), and other pit vipers (Crotalinae).[41][42]

"Crisp3 [is] mainly [expressed] in the salivary glands, pancreas, and prostate."[43] "CRISP3 is highly expressed in the human cauda epididymidis and ampulla of vas deferens (Udby et al. 2005)."[43]

ZNF497

  1. Gene ID: 503538 is A1BG-AS1 A1BG antisense RNA 1.[16] A1BG-AS1 is transcribed in the negative direction from ZSCAN22.[16]
  2. Gene ID: 162968 is ZNF497 zinc finger protein 497.[44] ZNF497 is transcribed in the positive direction from RNA5SP473.[44]
  3. Gene ID: 100419840 is LOC100419840 zinc finger protein 446 pseudogene.[45] LOC100419840 may be transcribed in the positive direction from LOC105372483.[45]
  4. Gene ID: 105372483 is LOC105372483 uncharacterized LOC105372483 ncRNA.[46] LOC105372483 is transcribed in the negative direction from LOC100419840.[46]
  5. Gene ID: 106479017 is RNA5SP473 RNA, 5S ribosomal pseudogene 473.[47] RNA5SP473 may be transcribed in the negative direction from ZNF497.[47]

A boxes

There is one A box on the positive strand in the negative direction (from ZSCAN22 to A1BG): 3'-TGACTCT-5' at 2788.

There is one A box complement on the negative strand in the negative direction: 3'-ACTGAGA-5' at 2788.

There is one A box inverse complement on the negative strand in the positive direction: 3'-AGAGTCA-5' at 2613.

There is one A box inverse on the positive strand in the positive direction: 3'-TCTCAGT-5' at 2613.
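
The four forms listed here (the motif, its complement, its inverse, and its inverse complement) can be derived mechanically from the A box sequence. This is a minimal sketch of that derivation, assuming "inverse" means the sequence read in reverse order, as the listed sequences suggest; the function name orientations is illustrative.

```python
# Base-pairing table for plain A/C/G/T sequences.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def orientations(motif: str) -> dict:
    """Return the motif together with its complement, inverse and inverse complement."""
    comp = motif.translate(COMPLEMENT)
    return {
        "motif": motif,
        "complement": comp,
        "inverse": motif[::-1],
        "inverse complement": comp[::-1],
    }

for name, seq in orientations("TGACTCT").items():
    print(f"{name}: 3'-{seq}-5'")
```

For 3'-TGACTCT-5' this gives ACTGAGA, TCTCAGT and AGAGTCA, the same three derived sequences reported above.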

ACGT-containing elements

  1. ACGT elements, negative strand, negative direction: 24, 3'-ACGT-5' at 150, 3'-ACGT-5' at 1030, 3'-ACGT-5' at 1321, 3'-ACGT-5' at 1337, 3'-ACGT-5' at 1345, 3'-ACGT-5' at 1470, 3'-ACGT-5' at 1494, 3'-ACGT-5' at 1535, 3'-ACGT-5' at 1717, 3'-ACGT-5' at 1974, 3'-ACGT-5' at 1998, 3'-ACGT-5' at 2081, 3'-ACGT-5' at 2400, 3'-ACGT-5' at 2424, 3'-ACGT-5' at 2735, 3'-ACGT-5' at 2759, 3'-ACGT-5' at 2863, 3'-ACGT-5' at 3287, 3'-ACGT-5' at 3429, 3'-ACGT-5' at 3771, 3'-ACGT-5' at 4245, 3'-ACGT-5' at 4315, 3'-ACGT-5' at 4330, 3'-ACGT-5' at 4338.
  2. ACGT elements, negative strand, positive direction: 2, 3'-ACGT-5' at 569, 3'-ACGT-5' at 3254.
  3. ACGT elements, positive strand, negative direction: 4, 3'-ACGT-5' at 342, 3'-ACGT-5' at 531, 3'-ACGT-5' at 1772, 3'-ACGT-5' at 4236.
  4. ACGT elements, positive strand, positive direction: 44, 3'-ACGT-5' at 192, 3'-ACGT-5' at 224, 3'-ACGT-5' at 436, 3'-ACGT-5' at 531, 3'-ACGT-5' at 546, 3'-ACGT-5' at 656, 3'-ACGT-5' at 783, 3'-ACGT-5' at 1119, 3'-ACGT-5' at 1218, 3'-ACGT-5' at 1370, 3'-ACGT-5' at 1470, 3'-ACGT-5' at 1505, 3'-ACGT-5' at 1613, 3'-ACGT-5' at 1786, 3'-ACGT-5' at 1820, 3'-ACGT-5' at 1935, 3'-ACGT-5' at 2063, 3'-ACGT-5' at 2204, 3'-ACGT-5' at 2326, 3'-ACGT-5' at 2334, 3'-ACGT-5' at 2350, 3'-ACGT-5' at 2681, 3'-ACGT-5' at 2690, 3'-ACGT-5' at 2719, 3'-ACGT-5' at 2743, 3'-ACGT-5' at 2800, 3'-ACGT-5' at 2857, 3'-ACGT-5' at 2960, 3'-ACGT-5' at 3061, 3'-ACGT-5' at 3070, 3'-ACGT-5' at 3142, 3'-ACGT-5' at 3230, 3'-ACGT-5' at 3268, 3'-ACGT-5' at 3279, 3'-ACGT-5' at 3320, 3'-ACGT-5' at 3341, 3'-ACGT-5' at 3400, 3'-ACGT-5' at 3459, 3'-ACGT-5' at 3464, 3'-ACGT-5' at 3829, 3'-ACGT-5' at 3883, 3'-ACGT-5' at 3960, 3'-ACGT-5' at 4315, 3'-ACGT-5' at 4341.
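
Counts and positions of this kind can be produced by scanning a sequence for every occurrence of the motif. The sketch below is one way to do so and is not the program used to generate the lists above. It reports 1-based start positions, which is an assumption, since the position convention behind the listed numbers is not stated, and it runs on a made-up toy sequence rather than on the A1BG locus.

```python
import re

def find_motif(sequence: str, motif: str) -> list[int]:
    """Return 1-based start positions of every (possibly overlapping) match."""
    # A zero-width lookahead lets overlapping occurrences be reported too.
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [m.start() + 1 for m in pattern.finditer(sequence.upper())]

toy = "TTACGTGAGACGTACGT"  # made-up sequence, not from the A1BG locus
hits = find_motif(toy, "ACGT")
print(len(hits), hits)
```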

ACGT-containing elements include these metal responsive elements:

  1. complement, negative strand, negative direction: 6, 3'-ACGTGAG-5' at 1348, 3'-ACGTGAG-5' at 2001, 3'-ACGTGAG-5' at 2427, 3'-ACGTGGG-5' at 2762, 3'-ACGTGAG-5' at 3290, and 3'-ACGTGAG-5' at 4341.
  2. complement, positive strand, negative direction: 6, 3'-ACGTGTG-5' at 549, 3'-ACGTGTG-5' at 1221, 3'-ACGTGAG-5' at 1373, 3'-ACGTGAG-5' at 1473, 3'-ACGTGTG-5' at 2963, 3'-ACGTGGG-5' at 3323.
  3. inverse, negative strand, negative direction: 2, 3'-CTCACGT-5' at 1470, 3'-CACACGT-5' at 2863.
  4. inverse, positive strand, negative direction: 2, 3'-CACACGT-5' at 531, 3'-CTCACGT-5' at 1772.
  5. inverse, positive strand, positive direction: 6, 3'-CGCACGT-5' at 546, 3'-CGCACGT-5' at 1218, 3'-CTCACGT-5' at 1786, 3'-CTCACGT-5' at 2326, 3'-CCCACGT-5' at 2800, 3'-CCCACGT-5' at 3883.

ACGT-containing elements include these cAMP response elements (CRE):

  1. negative strand in the negative direction (from ZSCAN22 to A1BG): 1, 3'-TGACGTCA-5' at 4317.

AGC boxes

An inverse AGC box occurs on the negative strand in the negative direction: 3'-CCGCCGA-5' at 1754 nts from ZSCAN22 toward A1BG, in the distal promoter, with its complement on the positive strand in the negative direction.

Angiotensinogen core promoter elements

  1. AGCE, negative strand, negative direction, looking for 3'-A/C-T-C/T-G-T-G-5': 4, 3'-ATTGTG-5' at 340, 3'-ATCGTG-5' at 2096, 3'-CTTGTG-5' at 3669, 3'-CTCGTG-5' at 3914.
  2. AGCE, negative strand, positive direction, looking for 3'-A/C-T-C/T-G-T-G-5': 2, 3'-ATTGTG-5' at 2679, 3'-CTCGTG-5' at 4376.
  3. AGCE, positive strand, negative direction, looking for 3'-A/C-T-C/T-G-T-G-5': 0.
  4. AGCE, positive strand, positive direction, looking for 3'-A/C-T-C/T-G-T-G-5': 6, 3'-CTCGTG-5' at 855, 3'-CTCGTG-5' at 955, 3'-CTCGTG-5' at 1207, 3'-CTCGTG-5' at 1627, 3'-CTTGTG-5' at 3095, 3'-CTCGTG-5' at 3739.
  5. AGCEc, negative strand, negative direction, looking for 3'-G/T-A-A/G-C-A-C-5': 0.
  6. AGCEc, negative strand, positive direction, looking for 3'-G/T-A-A/G-C-A-C-5': 6, 3'-GAGCAC-5' at 855, 3'-GAGCAC-5' at 955, 3'-GAGCAC-5' at 1207, 3'-GAGCAC-5' at 1627, 3'-GAACAC-5' at 3095, 3'-GAGCAC-5' at 3739.
  7. AGCEc, positive strand, negative direction, looking for 3'-G/T-A-A/G-C-A-C-5': 4, 3'-TAACAC-5' at 340, 3'-TAGCAC-5' at 2096, 3'-GAACAC-5' at 3669, 3'-GAGCAC-5' at 3914.
  8. AGCEc, positive strand, positive direction, looking for 3'-G/T-A-A/G-C-A-C-5': 2, 3'-TAACAC-5' at 2679, 3'-GAGCAC-5' at 4376.
  9. AGCEci, negative strand, negative direction, looking for 3'-C-A-C-A/G-A-G/T-5': 2, 3'-CACGAT-5' at 336, 3'-CACGAG-5' at 4403.
  10. AGCEci, negative strand, positive direction, looking for 3'-C-A-C-A/G-A-G/T-5': 1, 3'-CACGAG-5' at 243.
  11. AGCEci, positive strand, negative direction, looking for 3'-C-A-C-A/G-A-G/T-5': 10, 3'-CACGAG-5' at 435, 3'-CACGAG-5' at 572, 3'-CACGAG-5' at 708, 3'-CACGAG-5' at 1182, 3'-CACAAT-5' at 1721, 3'-CACAAG-5' at 2244, 3'-CACGAG-5' at 3232, 3'-CACAAT-5' at 3515, 3'-CACAAG-5' at 3634, 3'-CACGAG-5' at 4472.
  12. AGCEci, positive strand, positive direction, looking for 3'-C-A-C-A/G-A-G/T-5': 3, 3'-CACAAG-5' at 107, 3'-CACGAG-5' at 2090, 3'-CACGAG-5' at 3152.
  13. AGCEi, negative strand, negative direction, looking for 3'-G-T-G-C/T-T-A/C-5': 10, 3'-GTGCTC-5' at 435, 3'-GTGCTC-5' at 572, 3'-GTGCTC-5' at 708, 3'-GTGCTC-5' at 1182, 3'-GTGTTA-5' at 1721, 3'-GTGTTC-5' at 2244, 3'-GTGCTC-5' at 3232, 3'-GTGTTA-5' at 3515, 3'-GTGTTC-5' at 3634, 3'-GTGCTC-5' at 4472.
  14. AGCEi, negative strand, positive direction, looking for 3'-G-T-G-C/T-T-A/C-5': 3, 3'-GTGTTC-5' at 107, 3'-GTGCTC-5' at 2090, 3'-GTGCTC-5' at 3152.
  15. AGCEi, positive strand, negative direction, looking for 3'-G-T-G-C/T-T-A/C-5': 2, 3'-GTGCTA-5' at 336, 3'-GTGCTC-5' at 4403.
  16. AGCEi, positive strand, positive direction, looking for 3'-G-T-G-C/T-T-A/C-5': 0.
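
The four degenerate search patterns used in this list (AGCE, AGCEc, AGCEi and AGCEci) are related by complementing and reversing the consensus position by position. A minimal sketch of that relationship follows, written as regular-expression character classes; the helper names are illustrative and the code is not the search program used for the counts above.

```python
import re

COMP = {"A": "T", "C": "G", "G": "C", "T": "A"}

def parse(consensus: str) -> list[str]:
    """Split 'A/C-T-C/T-G-T-G' into per-position base sets: ['AC', 'T', ...]."""
    return [pos.replace("/", "") for pos in consensus.split("-")]

def complement(sites):
    """Complement every allowed base at every position."""
    return ["".join(sorted(COMP[b] for b in s)) for s in sites]

def inverse(sites):
    """Read the positions in reverse order."""
    return sites[::-1]

def to_regex(sites) -> str:
    """Render base sets as a regular expression, e.g. '[AC]T[CT]GTG'."""
    return "".join(s if len(s) == 1 else f"[{s}]" for s in sites)

agce = parse("A/C-T-C/T-G-T-G")
variants = {
    "AGCE": agce,
    "AGCEc": complement(agce),
    "AGCEi": inverse(agce),
    "AGCEci": inverse(complement(agce)),
}
for name, sites in variants.items():
    print(name, to_regex(sites))
```

The derived classes correspond to the consensus strings given in the list: G/T-A-A/G-C-A-C for AGCEc, G-T-G-C/T-T-A/C for AGCEi, and C-A-C-A/G-A-G/T for AGCEci.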

ATA boxes

Core promoters

There is the following inverse ATA box on the negative strand, negative direction: 1, 3'-AAATAA-5' at 4537 inside A1BG as the TSS is at 4460 nts from ZSCAN22.

Proximal promoters

There is the following inverse ATA box on the positive strand, negative direction: 3'-AAATAA-5' at 4221.

There is one inverse and inverse complement between 4050 and 4300 in the positive direction: 3'-AAATAA-5' at 4142, and 3'-TTTATT-5' at 4142.

Distal promoters

There is the following ATA box on the negative strand in the negative direction: 1, 3'-AATAAA-5' at 1726 nts from ZSCAN22.

There are the following ATA boxes on the positive strand in the negative direction: 3, 3'-AATAAA-5' at 3014, 3'-AATAAA-5' at 3335, and 3'-AATAAA-5' at 4072.

There are the following inverse ATA boxes on the positive strand, negative direction: 4, 3'-AAATAA-5' at 3013, 3'-AAATAA-5' at 3334, 3'-AAATAA-5' at 4071, 3'-AAATAA-5' at 4075.

There is the following ATA box on the negative strand in the positive direction: 1, 3'-AATAAA-5' at 3427. It has a complement on the positive strand in the positive direction: 1, 3'-TTATTT-5' at 3427.

There is another inverse complement ATA box on the negative strand in the positive direction in distal promoter: 3'-TTTATT-5' at 2347. It also has an inverse in the distal promoter: 3'-AAATAA-5' at 2347.

B boxes

While there appear to be at least two B boxes, TGGGCA is one B-box,[48] where the "mP2 EB fragment used for binding was the 118 nucleotide fragment extending from the Dde I site at position -140 to the Dde I site at position -23 [...]. This fragment contains the GC, E, B, CAAT, and TATA boxes."[48]

  1. negative strand in the negative direction, looking for 3'-TGGGCA-5', 0.
  2. negative strand in the positive direction, looking for 3'-TGGGCA-5', 4, 3'-TGGGCA-5' at 27, 3'-TGGGCA-5' at 1945, 3'-TGGGCA-5' at 2894, 3'-TGGGCA-5' at 4180.
  3. positive strand in the negative direction, looking for 3'-TGGGCA-5', 9, 3'-TGGGCA-5' at 462, 3'-TGGGCA-5' at 902, 3'-TGGGCA-5' at 1114, 3'-TGGGCA-5' at 1359, 3'-TGGGCA-5' at 2438, 3'-TGGGCA-5' at 2773, 3'-TGGGCA-5' at 3301, 3'-TGGGCA-5' at 4040, 3'-TGGGCA-5' at 4191.
  4. positive strand in the positive direction, looking for 3'-TGGGCA-5', 0.
  5. complement, negative strand, negative direction, looking for 3'-ACCCGT-5', 9, 3'-ACCCGT-5' at 462, 3'-ACCCGT-5' at 902, 3'-ACCCGT-5' at 1114, 3'-ACCCGT-5' at 1359, 3'-ACCCGT-5' at 2438, 3'-ACCCGT-5' at 2773, 3'-ACCCGT-5' at 3301, 3'-ACCCGT-5' at 4040, 3'-ACCCGT-5' at 4191.
  6. complement, negative strand, positive direction, looking for 3'-ACCCGT-5', 0.
  7. complement, positive strand, negative direction, looking for 3'-ACCCGT-5', 0.
  8. complement, positive strand, positive direction, looking for 3'-ACCCGT-5', 4, 3'-ACCCGT-5' at 27, 3'-ACCCGT-5' at 1945, 3'-ACCCGT-5' at 2894, 3'-ACCCGT-5' at 4180.
  9. inverse complement, negative strand, negative direction, looking for 3'-TGCCCA-5', 0.
  10. inverse complement, negative strand, positive direction, looking for 3'-TGCCCA-5', 2, 3'-TGCCCA-5' at 3237, 3'-TGCCCA-5' at 3377.
  11. inverse complement, positive strand, negative direction, looking for 3'-TGCCCA-5', 4, 3'-TGCCCA-5' at 1458, 3'-TGCCCA-5' at 3854, 3'-TGCCCA-5' at 3883, 3'-TGCCCA-5' at 4251.
  12. inverse complement, positive strand, positive direction, looking for 3'-TGCCCA-5', 1, 3'-TGCCCA-5' at 3750.
  13. inverse, negative strand, negative direction, looking for 3'-ACGGGT-5', 4, 3'-ACGGGT-5' at 1458, 3'-ACGGGT-5' at 3854, 3'-ACGGGT-5' at 3883, 3'-ACGGGT-5' at 4251.
  14. inverse, negative strand, positive direction, looking for 3'-ACGGGT-5', 1, 3'-ACGGGT-5' at 3750.
  15. inverse, positive strand, negative direction, looking for 3'-ACGGGT-5', 0.
  16. inverse, positive strand, positive direction, looking for 3'-ACGGGT-5', 2, 3'-ACGGGT-5' at 3237, 3'-ACGGGT-5' at 3377.

The other B box is associated with the human transforming growth factor b1 binding sequences.[49] It has the consensus sequence 3'-TGTCTCA-5'. Let it be designated B1box.

  1. negative strand in the negative direction, looking for 3'-TGTCTCA-5', 2, 3'-TGTCTCA-5' at 1075, 3'-TGTCTCA-5' at 2445.
  2. negative strand in the positive direction, looking for 3'-TGTCTCA-5', 2, 3'-TGTCTCA-5' at 2174, 3'-TGTCTCA-5' at 2468.
  3. positive strand in the negative direction, looking for 3'-TGTCTCA-5', 5, 3'-TGTCTCA-5' at 923, 3'-TGTCTCA-5' at 1089, 3'-TGTCTCA-5' at 2033, 3'-TGTCTCA-5' at 3323, 3'-TGTCTCA-5' at 4373.
  4. positive strand in the positive direction, looking for 3'-TGTCTCA-5', 0.
  5. complement, negative strand, negative direction, looking for 3'-ACAGAGT-5', 5, 3'-ACAGAGT-5' at 923, 3'-ACAGAGT-5' at 1089, 3'-ACAGAGT-5' at 2033, 3'-ACAGAGT-5' at 3323, 3'-ACAGAGT-5' at 4373.
  6. complement, negative strand, positive direction, looking for 3'-ACAGAGT-5', 0.
  7. complement, positive strand, negative direction, looking for 3'-ACAGAGT-5', 2, 3'-ACAGAGT-5' at 1075, 3'-ACAGAGT-5' at 2445.
  8. complement, positive strand, positive direction, looking for 3'-ACAGAGT-5', 2, 3'-ACAGAGT-5' at 2174, 3'-ACAGAGT-5' at 2468.
  9. inverse complement, negative strand, negative direction, looking for 3'-TGAGACA-5', 3, 3'-TGAGACA-5' at 919, 3'-TGAGACA-5' at 1085, 3'-TGAGACA-5' at 2029.
  10. inverse complement, negative strand, positive direction, looking for 3'-TGAGACA-5', 0.
  11. inverse complement, positive strand, negative direction, looking for 3'-TGAGACA-5', 0.
  12. inverse complement, positive strand, positive direction, looking for 3'-TGAGACA-5', 1, 3'-TGAGACA-5' at 2308.
  13. inverse, negative strand, negative direction, looking for 3'-ACTCTGT-5', 0.
  14. inverse, negative strand, positive direction, looking for 3'-ACTCTGT-5', 1, 3'-ACTCTGT-5' at 2308.
  15. inverse, positive strand, negative direction, looking for 3'-ACTCTGT-5', 3, 3'-ACTCTGT-5' at 919, 3'-ACTCTGT-5' at 1085, 3'-ACTCTGT-5' at 2029.
  16. inverse, positive strand, positive direction, looking for 3'-ACTCTGT-5', 0.
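
In the list above, hits for a motif on one strand reappear at the same positions as hits for its complement on the other strand, which is what base pairing predicts. The sketch below is a simple consistency check of that pattern for the first eight B1box entries. The positions are copied from the list; the dictionary layout and the function name mirror are illustrative.

```python
# Hit positions copied from the B1box list above.
# Keys are (form, strand, direction); "-" is negative, "+" is positive.
HITS = {
    ("motif", "-", "-"): [1075, 2445],
    ("motif", "-", "+"): [2174, 2468],
    ("motif", "+", "-"): [923, 1089, 2033, 3323, 4373],
    ("motif", "+", "+"): [],
    ("complement", "-", "-"): [923, 1089, 2033, 3323, 4373],
    ("complement", "-", "+"): [],
    ("complement", "+", "-"): [1075, 2445],
    ("complement", "+", "+"): [2174, 2468],
}

def mirror(key):
    """Return the entry expected to carry the same positions on the other strand."""
    form, strand, direction = key
    other_form = "complement" if form == "motif" else "motif"
    other_strand = "+" if strand == "-" else "-"
    return (other_form, other_strand, direction)

mismatches = [k for k in HITS if HITS[k] != HITS[mirror(k)]]
print("mismatched entries:", mismatches)
```

An empty list of mismatches indicates that the motif and complement entries agree with each other; it does not by itself confirm the positions against the underlying sequence.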

B recognition elements

The factor II B recognition element is BREu.

Negative strand in the negative direction there are 3: 3'-CCACGCC-5' at 380, 3'-CCGCGCC-5' at 1762, and 3'-CCACGCC-5' at 2197 in the distal promoter.

Complement, negative strand, negative direction there is 1: 3'-CCTGCGG-5' at 1153.

Inverse complement, positive strand, negative direction there are 4: 3'-GGCGTGG-5' at 1244, 3'-GGCGCGG-5' at 1762, 3'-GGCGTGG-5' at 1897, and 3'-GGCGTGG-5' at 3047.

Negative strand in the positive direction there are 3: 3'-GCACGCC-5', 1302, 3'-GGACGCC-5', 1672, 3'-GGGCGCC-5', 1769.

Positive strand in the positive direction there are 3: 3'-CCACGCC-5', 489, 3'-CGACGCC-5', 1033, 3'-CCACGCC-5', 1764.

Inverse complement, negative strand, positive direction there is 1: 3'-GGCGCCC-5', 1770.

Inverse complement, positive strand, positive direction there are 4: 3'-GGCGCGC-5', 682, 3'-GGCGCCG-5', 1338, 3'-GGCGCCG-5', 1438, 3'-GGCGTGG-5', 2566.

CAAT boxes

There are no CAAT boxes in either promoter.

CAREs

A CARE occurs in the negative direction: 3'-CAACTC-5' at 86 possibly associated with ZSCAN22. But inverse CAREs occur 3'-CTCAAC-5' at 1406, 3'-CTCAAC-5' at 2592, 3'-CTCAAC-5' at 2704, 3'-CTCAAC-5' at 3115, and 3'-CTCAAC-5' at 4096.

A CARE occurs in the positive direction: 3'-CAACTC-5' at 3292. But inverse CAREs occur: 3'-CTCAAC-5' at 1406, 3'-CTCAAC-5' at 1621, and 3'-CTCAAC-5' at 3290.

CArG boxes

There is a more general CArG box, 3'-CATTAAAAGG-5', at 3441 from ZSCAN22, or -1019 nts from the TSS of A1BG in the negative direction on the positive strand in the distal promoter.

A second more general CArG box, 3'-CAAAAAAAAG-5', at 1399 from ZSCAN22, or -3061 nts from the A1BG TSS may be a CArG box for ZSCAN22 in the negative direction on the positive strand in the distal promoter.
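
The TSS-relative positions quoted for the two CArG boxes follow from subtracting the A1BG TSS position (4460 nts from ZSCAN22, as given in the ATA box section) from the position counted from ZSCAN22. A minimal sketch of that conversion, with an illustrative function name:

```python
TSS_FROM_ZSCAN22 = 4460  # A1BG TSS position used in the text

def relative_to_tss(position: int, tss: int = TSS_FROM_ZSCAN22) -> int:
    """Negative values lie before the TSS, positive values beyond it."""
    return position - tss

for pos in (3441, 1399):
    print(pos, relative_to_tss(pos))
```

This reproduces the -1019 and -3061 nts stated above for positions 3441 and 1399.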

C boxes

Proximal promoters

Inverse complement, negative strand, negative direction there is 1: 3'-ACATCA-5', 4124.

There is one C box 3'-ACATCA-5' at 4116 nts in the positive direction.

Distal promoters

There are four C boxes: 3'-AGTAGT-5' at 2888, 3'-AGTAGT-5' at 2944, 3'-AGTAGT-5' at 3418, and 3'-AGTAGT-5' at 3521 on the negative strand in the negative direction, with their complements on the positive strand.

Inverse complement, negative strand, negative direction there are 2: 3'-ACATCA-5', 2340, 3'-ACATCA-5', 2541.

There is one complement C box: 3'-TCATCA-5' at 3251 on the negative strand in the positive direction and its complement on the positive strand.

Inverse, negative strand, positive direction, there is 1: 3'-TGATGA-5', 2144.

Positive strand in the positive direction there is 1: 3'-AGTAGT-5', 3251.

CENP-B boxes

There are no CENP-B boxes in either promoter.

CGCG boxes

Negative strand in the negative direction there are 2: 3'-GCGCGT-5', 161, 3'-CCGCGC-5', 1761, in the distal promoter.

Positive strand in the negative direction there is 1: 3'-GCGCGG-5', 1762, in the distal promoter.

Negative strand in the positive direction there are 8: 3'-GCGCGT-5', 543, 3'-CCGCGC-5', 681, 3'-GCGCGC-5', 683, 3'-ACGCGG-5', 871, 3'-ACGCGG-5', 971, 3'-CCGCGG-5', 1337, 3'-CCGCGG-5', 1437, 3'-CCGCGC-5', 1650, in the distal promoter.

Positive strand in the positive direction there are 22: 3'-CCGCGC-5', 161, 3'-ACGCGG-5', 452, 3'-CCGCGC-5', 542, 3'-GCGCGC-5', 682, 3'-GCGCGT-5', 684, 3'-CCGCGT-5', 876, 3'-CCGCGT-5', 976, 3'-CCGCGT-5', 1046, 3'-ACGCGG-5', 1078, 3'-ACGCGG-5', 1162, 3'-CCGCGC-5', 1214, 3'-ACGCGG-5', 1246, 3'-CCGCGT-5', 1298, 3'-ACGCGT-5', 1314, 3'-ACGCGG-5', 1354, 3'-ACGCGG-5', 1398, 3'-ACGCGT-5', 1414, 3'-ACGCGG-5', 1454, 3'-ACGCGG-5', 1498, 3'-ACGCGT-5', 1523, 3'-CCGCGT-5', 1550, 3'-CCGCGG-5', 1769, in the distal promoter.

CRE boxes

Negative strand in the negative direction there is 1: 3'-TGACGTCA-5', 4317, and its complement in the proximal promoter.
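
The CRE sequence 3'-TGACGTCA-5' is the same as its own inverse complement, so the inverse and inverse complement searches cannot yield distinct sequences for it. The sketch below checks this property; the function names are illustrative.

```python
def inverse_complement(motif: str) -> str:
    """Complement each base, then read the result in reverse order."""
    return motif.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def is_palindromic(motif: str) -> bool:
    """True when a motif equals its own inverse complement."""
    return motif == inverse_complement(motif)

print(is_palindromic("TGACGTCA"))  # the CRE
print(is_palindromic("TGACTCT"))   # the A box, for contrast
```

The CRE passes this test, whereas the A box 3'-TGACTCT-5' does not, which is why the A box section lists four distinct sequences.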

D boxes

There is one D box in the distal promoter: 3'-AGTCTG-5' at 2947 on the negative strand in the negative direction and its complement on the positive strand.

Positive strand in the negative direction there is 1: 3'-AGTCTG-5', 1355.

Inverse complement, positive strand, negative direction there are 2: 3'-CAGACT-5', 15, 3'-CAGACT-5', 1616.

There is one D box in the distal promoter: 3'-AGTCTG-5' at 3923 on the negative strand in the positive direction and its complement on the positive strand.

Inverse complement, negative strand, positive direction there are 2: 3'-CAGACT-5', 1744, 3'-CAGACT-5', 2416.

Inverse complement, positive strand, positive direction there are 3: 3'-CAGACT-5', 2943, 3'-CAGACT-5', 3006, 3'-CAGACT-5', 3924.

Downstream B recognition elements

  1. negative strand in the negative direction, looking for 3'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-5', 59, 3'-ATTTTGT-5' at 68, 3'-ATATGTT-5' at 113, 3'-GTTTTGT-5' at 166, 3'-ATATTTT-5' at 183, 3'-ATATTTT-5' at 222, 3'-GTTTTGG-5' at 259, 3'-ATGTTTT-5' at 485, 3'-GTTTTTT-5' at 487, 3'-ATTGGGG-5' at 616, 3'-ATGTTTT-5' at 637, 3'-GTTTTTT-5' at 639, 3'-ATGTTTT-5' at 771, 3'-GTTTTTT-5' at 773, 3'-GTGTGGT-5' at 883, 3'-GTTTTTT-5' at 928, 3'-GTTTTTT-5' at 1094, 3'-ATGTTTT-5' at 1228, 3'-GTTTTTT-5' at 1230, 3'-GTTTTTG-5' at 1386, 3'-GTTTGTT-5' at 1392, 3'-GTTTTTT-5' at 1396, 3'-GTTGGGT-5' at 1409, 3'-GTTGGGT-5' at 1516, 3'-GTTTGTG-5' at 1540, 3'-ATGTTTT-5' at 1880, 3'-GTTTTTT-5' at 1882, 3'-GTTTTTT-5' at 2038, 3'-ATGTTTT-5' at 2182, 3'-GTTTTTT-5' at 2184, 3'-ATGTTTT-5' at 2307, 3'-GTTTTTT-5' at 2309, 3'-GTGTGGT-5' at 2419, 3'-GTTTGTT-5' at 2484, 3'-GTTTGTT-5' at 2488, 3'-ATATGTT-5' at 2642, 3'-ATGTTTT-5' at 2644, 3'-GTGGGGT-5' at 2764, 3'-GTTGGGT-5' at 2846, 3'-ATATTTG-5' at 2875, 3'-GTAGTTT-5' at 2890, 3'-ATTTTTT-5' at 3026, 3'-GTGGGTT-5' at 3136, 3'-ATTTTTG-5' at 3165, 3'-GTATTTT-5' at 3171, 3'-GTTTTTG-5' at 3328, 3'-ATTTGTT-5' at 3338, 3'-ATTTGGT-5' at 3365, 3'-ATTTGGT-5' at 3484, 3'-GTAGTTG-5' at 3523, 3'-ATGGTGG-5' at 3740, 3'-GTGTTTT-5' at 3767, 3'-ATGTTTT-5' at 4066, 3'-GTTTTTT-5' at 4068, 3'-GTTGTGT-5' at 4196, 3'-ATGTTTT-5' at 4216, 3'-GTTTTTT-5' at 4218, 3'-GTTTTTT-5' at 4378, 3'-GTGGGGT-5' at 4446, 3'-GTAGGTG-5' at 4458 and their complements.
  2. negative strand in the positive direction, looking for 3'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-5', 11, 3'-GTGGGGG-5' at 56, 3'-ATTTTTT-5' at 2451, 3'-GTGTTGG-5' at 2816, 3'-ATGTTTG-5' at 3339, 3'-GTGGTGG-5' at 3816, 3'-GTGTGGT-5' at 3967, 3'-GTGGTGT-5' at 3969, 3'-GTGGTTT-5' at 4108, 3'-ATTGTTG-5' at 4173, 3'-ATGGGGG-5' at 4225, 3'-GTGGGGT-5' at 4397 and their complements.
  3. positive strand in the negative direction, looking for 3'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-5', 31, 3'-ATATGTT-5' at 43, 3'-ATATGGG-5' at 78, 3'-ATGGGGT-5' at 204, 3'-ATGTTTT-5' at 215, 3'-ATATGGT-5' at 606, 3'-ATGGTGT-5' at 608, 3'-ATGTGGT-5' at 788, 3'-GTGGTGG-5' at 790, 3'-GTGGTGT-5' at 793, 3'-ATTGGGT-5' at 1047, 3'-GTGGGTG-5' at 1163, 3'-GTGGTGG-5' at 1247, 3'-GTGGTGT-5' at 1477, 3'-GTGGTGG-5' at 1900, 3'-GTGGTGG-5' at 1903, 3'-GTGGGTG-5' at 2332, 3'-GTGTGGT-5' at 2659, 3'-GTGGTGG-5' at 2661, 3'-ATATTTT-5' at 2853, 3'-GTGGTGG-5' at 3050, 3'-GTGTGGT-5' at 3187, 3'-GTGGTGG-5' at 3189, 3'-GTGGTGG-5' at 3192, 3'-GTGGGTG-5' at 3195, 3'-ATTGGTT-5' at 3531, 3'-GTGGTTG-5' at 3605, 3'-ATGGGGT-5' at 3802, 3'-ATGTGGT-5' at 3811, 3'-GTGTTGG-5' at 3942, 3'-GTTGGTT-5' at 3944, 3'-ATGGTGG-5' at 4110 and their complements.
  4. positive strand in the positive direction, looking for 3'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-5', 19, 3'-GTGGGTG-5' at 72, 3'-GTAGGTG-5' at 631, 3'-GTAGGTG-5' at 700, 3'-GTGGTGG-5' at 704, 3'-ATGGGGT-5' at 1891, 3'-GTTGGGT-5' at 2015, 3'-GTGGGGG-5' at 2020, 3'-GTTGGTG-5' at 2122, 3'-ATATGGT-5' at 2591, 3'-ATGGTGT-5' at 2600, 3'-GTGTGGT-5' at 2603, 3'-ATGGTGG-5' at 2759, 3'-GTGTGGG-5' at 2965, 3'-ATAGGGT-5' at 3386, 3'-GTAGGGT-5' at 3631, 3'-GTGTGGT-5' at 3825, 3'-GTTTGTG-5' at 4257, 3'-GTGGGGT-5' at 4286, 3'-GTGGGGT-5' at 4328 and their complements.
  5. inverse, negative strand, negative direction, is SuccessablesdBREi--.bas, looking for 3'-G/T-G/T-G/T-G/T-A/G/T-T-A/G-5', 44, 3'-TTTGTTA-5' at 230, 3'-TTTTGTA-5' at 361, 3'-TTTTTTA-5' at 488, 3'-TTTTATG-5' at 633, 3'-TTTTATG-5' at 767, 3'-TGTGGTA-5' at 884, 3'-GGTTGTA-5' at 1205, 3'-TTTTTTA-5' at 1231, 3'-GTTTTTG-5' at 1386, 3'-GTTTGTG-5' at 1540, 3'-TTTTATG-5' at 1564, 3'-TTGTTTG-5' at 1587, 3'-TTTTATA-5' at 1740, 3'-TGGGGTA-5' at 1861, 3'-TTTTATG-5' at 1876, 3'-TTTTTTA-5' at 2061, 3'-GGTTGTA-5' at 2150, 3'-TTTTTTA-5' at 2185, 3'-TGGGGTA-5' at 2288, 3'-TTTTATG-5' at 2303, 3'-TGTGGTG-5' at 2420, 3'-TTGTTTG-5' at 2486, 3'-TTGTTTG-5' at 2511, 3'-GGTTGTG-5' at 2549, 3'-GGTTGTA-5' at 2612, 3'-GTTTTTA-5' at 2646, 3'-TTTGTTG-5' at 2843, 3'-TTTTATA-5' at 2869, 3'-TTTTTTA-5' at 2930, 3'-TTTGGTG-5' at 2972, 3'-TTTTTTG-5' at 3027, 3'-TGGGTTG-5' at 3137, 3'-TGGGGTA-5' at 3152, 3'-TTTTGTA-5' at 3167, 3'-GTTTTTG-5' at 3328, 3'-TTTGGTG-5' at 3366, 3'-TTTTGTG-5' at 3512, 3'-GTTGATA-5' at 3526, 3'-TGTTTTA-5' at 3768, 3'-GGGTATG-5' at 3857, 3'-GGTTGTG-5' at 3981, 3'-TTTTTTA-5' at 4069, 3'-TTTTTTA-5' at 4219, 3'-TTGGGTA-5' at 4454 and their complements.
  6. inverse, negative strand, positive direction, is SuccessablesdBREi-+.bas, looking for 3'-G/T-G/T-G/T-G/T-A/G/T-T-A/G-5', 16, 3'-GGGGATG-5' at 59, 3'-TGTTTTA-5' at 148, 3'-TTGGGTG-5' at 1802, 3'-TTTTTTG-5' at 2282, 3'-TGGGATG-5' at 2409, 3'-TTTTTTG-5' at 2452, 3'-GGGGATA-5' at 2659, 3'-GGTTTTG-5' at 2688, 3'-GTGGATG-5' at 2714, 3'-GGTGTTG-5' at 2815, 3'-GGTTATG-5' at 3026, 3'-TGTGGTG-5' at 3644, 3'-TTTGGTG-5' at 3949, 3'-TGTGGTG-5' at 3968, 3'-GGTTTTA-5' at 4110, 3'-TGGGGTG-5' at 4398 and their complements.
  7. inverse, positive strand, negative direction, is SuccessablesdBREi+-.bas, looking for 3'-G/T-G/T-G/T-G/T-A/G/T-T-A/G-5', 16, 3'-GTTTTTA-5' at 217, 3'-TGGTGTG-5' at 609, 3'-TGTGGTG-5' at 789, 3'-TTGGGTG-5' at 1048, 3'-GTGGGTG-5' at 1163, 3'-TTTTTTG-5' at 1433, 3'-TGGTGTG-5' at 1478, 3'-GGTGGTG-5' at 1902, 3'-GTGGGTG-5' at 2332, 3'-TGTGGTG-5' at 2660, 3'-GGGTGTG-5' at 3185, 3'-GGTTTTA-5' at 3350, 3'-TTGGTTG-5' at 3532, 3'-GTGGTTG-5' at 3605, 3'-GGTGATG-5' at 3798, 3'-TTGGTTG-5' at 3945 and their complements.
  8. inverse, positive strand, positive direction, is SuccessablesdBREi++.bas, looking for 3'-G/T-G/T-G/T-G/T-A/G/T-T-A/G-5', 14, 3'-GTGGGTG-5' at 72, 3'-GGTGGTG-5' at 703, 3'-TTGGATG-5' at 1283, 3'-TTGGGTG-5' at 2016, 3'-GTTGGTG-5' at 2122, 3'-TGGTGTG-5' at 2601, 3'-TTTGGTG-5' at 2633, 3'-TTGTGTG-5' at 3097, 3'-TGGTTTG-5' at 3176, 3'-TGTGGTA-5' at 3826, 3'-TGGGGTG-5' at 3941, 3'-TGGGGTA-5' at 4220, 3'-GTTTGTG-5' at 4257, 3'-TGGGGTG-5' at 4287 and their complements.
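
The searches above scan a strand for a degenerate consensus written as slash-separated alternatives joined by hyphens. The following is a minimal Python sketch of such a scan, not the SuccessablesdBRE programs themselves; the function names, the toy sequence, and the choice to report 1-based start positions are assumptions made for illustration.

```python
import re

def consensus_to_regex(consensus: str) -> str:
    """Turn a consensus such as 'A/G-T-A/G/T' into a regex such as '[AG]T[AGT]'."""
    parts = consensus.split("-")
    return "".join(p if len(p) == 1 else "[" + p.replace("/", "") + "]" for p in parts)

def inverse_consensus(consensus: str) -> str:
    """Reverse the order of the consensus positions (the 'inverse' search pattern)."""
    return "-".join(reversed(consensus.split("-")))

def find_hits(sequence: str, consensus: str):
    """Return (1-based start, matched text) for every hit, overlapping hits included."""
    # A lookahead makes the match zero-width, so overlapping hits are not skipped.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]

DBRE = "A/G-T-A/G/T-G/T-G/T-G/T-G/T"   # consensus as written in the list above
toy_sequence = "CCATGTTTTTTCC"          # hypothetical sequence, not taken from the locus

hits = find_hits(toy_sequence, DBRE)
print(hits)
print(inverse_consensus(DBRE))
```

In the toy sequence the two hits overlap and start two nucleotides apart, the same spacing as the paired 3'-ATGTTTT-5' and 3'-GTTTTTT-5' entries in the lists.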

Downstream core elements

In the negative direction on the negative strand, the A1BG transcription start site (TSS) lies 4460 nucleotides from the last nucleotide of the gene ZSCAN22. In the positive direction on the negative strand, the A1BG TSS lies at 4300 nucleotides, counted from a starting point well within the gene ZNF497. Downstream core elements (DCEs) are expected downstream of these TSSs. Occurrences before the TSSs are listed at Downstream core element gene transcriptions.

  1. negative strand, negative direction, looking for DCE SI: 3'-CTTC-5', 0.
  2. negative strand, positive direction, looking for DCE SI: 3'-CTTC-5', 0.
  3. positive strand, negative direction, looking for DCE SI: 3'-CTTC-5' at 4528.
  4. positive strand, positive direction, looking for DCE SI: 3'-CTTC-5', 0.
  1. negative strand, negative direction, looking for DCE SII: 3'-CTGT-5', 2, 3'-CTGT-5' at 4468, 3'-CTGT-5' at 4507.
  2. negative strand, positive direction, looking for DCE SII: 3'-CTGT-5', 1, 3'-CTGT-5' at 4392.
  3. positive strand, negative direction, looking for DCE SII: 3'-CTGT-5', 0.
  4. positive strand, positive direction, looking for DCE SII: 3'-CTGT-5', 1, 3'-CTGT-5' at 4332.
  1. negative strand, negative direction, looking for DCE SIII: 3'-AGC-5', 0.
  2. negative strand, positive direction, looking for DCE SIII: 3'-AGC-5', 1, 3'-AGC-5' at 4352.
  3. positive strand, negative direction, looking for DCE SIII: 3'-AGC-5', 3, 3'-AGC-5' at 4480, 3'-AGC-5' at 4489, 3'-AGC-5' at 4520.
  4. positive strand, positive direction, looking for DCE SIII: 3'-AGC-5', 1, 3'-AGC-5' at 4374.
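
Because downstream core elements are only expected beyond a TSS, each hit list can be filtered by position. A minimal sketch, assuming that "downstream" means a position number greater than the TSS position in the coordinates used here; the function name is hypothetical and the position 1200 is a made-up upstream value included only to show the filter working.

```python
def downstream_of(tss: int, positions: list[int]) -> list[int]:
    """Keep only the hit positions that lie beyond the transcription start site."""
    return [p for p in positions if p > tss]

TSS_NEGATIVE_DIRECTION = 4460  # from ZSCAN22 toward A1BG, as stated in the text
TSS_POSITIVE_DIRECTION = 4300  # from ZNF497 toward A1BG, as stated in the text

# 4468 and 4507 are the DCE SII hits listed above; 1200 is hypothetical.
candidates = [1200, 4468, 4507]
kept = downstream_of(TSS_NEGATIVE_DIRECTION, candidates)
print(kept)
```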

Complements

  1. negative strand, negative direction, looking for DCE SIc: 3'-GAAG-5', 1, 3'-GAAG-5' at 4528.
  2. negative strand, positive direction, looking for DCE SIc: 3'-GAAG-5', 0.
  3. positive strand, negative direction, looking for DCE SIc: 3'-GAAG-5', 0.
  4. positive strand, positive direction, looking for DCE SIc: 3'-GAAG-5', 0.
  1. negative strand, negative direction, looking for DCE SIIc: 3'-GACA-5', 0.
  2. negative strand, positive direction, looking for DCE SIIc: 3'-GACA-5', 1, 3'-GACA-5' at 4332.
  3. positive strand, negative direction, looking for DCE SIIc: 3'-GACA-5', 2, 3'-GACA-5' at 4468, 3'-GACA-5' at 4507.
  4. positive strand, positive direction, looking for DCE SIIc: 3'-GACA-5', 1, 3'-GACA-5' at 4392.
  1. negative strand, negative direction, looking for DCE SIIIc: 3'-TCG-5', 3, 3'-TCG-5' at 4480, 3'-TCG-5' at 4489, 3'-TCG-5' at 4520.
  2. negative strand, positive direction, looking for DCE SIIIc: 3'-TCG-5', 1, 3'-TCG-5' at 4374.
  3. positive strand, negative direction, looking for DCE SIIIc: 3'-TCG-5', 0.
  4. positive strand, positive direction, looking for DCE SIIIc: 3'-TCG-5', 1, 3'-TCG-5' at 4352.

Inverse complements

  1. looking for DCE SIci: 3'-GAAG-5', same as the complements.
  1. negative strand, negative direction, looking for DCE SIIci: 3'-ACAG-5', 0.
  2. negative strand, positive direction, looking for DCE SIIci: 3'-ACAG-5', 0.
  3. positive strand, negative direction, looking for DCE SIIci: 3'-ACAG-5', 1, 3'-ACAG-5' at 4517.
  4. positive strand, positive direction, looking for DCE SIIci: 3'-ACAG-5', 1, 3'-ACAG-5' at 4366.
  1. negative strand, negative direction, looking for DCE SIIIci: 3'-GCT-5', 1, 3'-GCT-5' at 4471.
  2. negative strand, positive direction, looking for DCE SIIIci: 3'-GCT-5', 4, 3'-GCT-5' at 4312, 3'-GCT-5' at 4321, 3'-GCT-5' at 4372, 3'-GCT-5' at 4390.
  3. positive strand, negative direction, looking for DCE SIIIci: 3'-GCT-5', 0.
  4. positive strand, positive direction, looking for DCE SIIIci: 3'-GCT-5', 1, 3'-GCT-5' at 4356.

Inverses

  1. looking for DCE SIi: 3'-CTTC-5', same as the direct transcript.
  1. negative strand, negative direction, looking for DCE SIIi: 3'-TGTC-5', 1, 3'-TGTC-5' at 4517.
  2. negative strand, positive direction, looking for DCE SIIi: 3'-TGTC-5', 1, 3'-TGTC-5' at 4366.
  3. positive strand, negative direction, looking for DCE SIIi: 3'-TGTC-5', 0.
  4. positive strand, positive direction, looking for DCE SIIi: 3'-TGTC-5', 0.
  1. negative strand, negative direction, looking for DCE SIIIi: 3'-CGA-5', 0.
  2. negative strand, positive direction, looking for DCE SIIIi: 3'-CGA-5', 1, 3'-CGA-5' at 4356.
  3. positive strand, negative direction, looking for DCE SIIIi: 3'-CGA-5', 1, 3'-CGA-5' at 4471.
  4. positive strand, positive direction, looking for DCE SIIIi: 3'-CGA-5', 4, 3'-CGA-5' at 4312, 3'-CGA-5' at 4321, 3'-CGA-5' at 4372, 3'-CGA-5' at 4390.
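
The complement, inverse, and inverse complement of each DCE subelement can be derived mechanically from the direct motif. The sketch below is an illustration with hypothetical function names; motifs are treated as plain strings in the orientation written above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(motif: str) -> str:
    """Swap each base for its Watson-Crick partner, keeping the order."""
    return motif.translate(COMPLEMENT)

def inverse(motif: str) -> str:
    """Reverse the order of the bases without complementing them."""
    return motif[::-1]

def inverse_complement(motif: str) -> str:
    """Complement the motif, then reverse it."""
    return complement(motif)[::-1]

DCE = {"SI": "CTTC", "SII": "CTGT", "SIII": "AGC"}

for name, motif in DCE.items():
    print(name, motif, complement(motif), inverse_complement(motif), inverse(motif))
```

SI reads the same in both orders, which is why its inverse matches the direct search and its inverse complement matches the complement search.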

Downstream promoter elements

  1. negative strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesDPE--.bas, looking for 3'-A/G-G-A/T-C/T-A/C/G-5', 163, 3'-GGTCG-5', 35, 3'-AGATA-5', 234, 3'-GGTCC-5', 262, 3'-GGACA-5', 394, 3'-GGTCG-5', 403, 3'-GGTTC-5', 419, 3'-AGTCC-5', 441, 3'-GGACC-5', 459, 3'-AGATG-5', 481, 3'-GGTCG-5', 504, 3'-GGACC-5', 508, 3'-GGTCG-5', 540, 3'-GGTTC-5', 556, 3'-AGTCC-5', 578, 3'-GGACC-5', 596, 3'-AGATG-5', 624, 3'-GGTCC-5', 648, 3'-GGACA-5', 667, 3'-GGTCG-5', 676, 3'-GGTTC-5', 692, 3'-AGTCC-5', 714, 3'-GGTCG-5', 728, 3'-GGTCG-5', 737, 3'-AGATG-5', 758, 3'-GGACA-5', 801, 3'-GGTCG-5', 810, 3'-GGTCC-5', 850, 3'-GGTTC-5', 874, 3'-GGTCG-5', 895, 3'-GGACC-5', 899, 3'-AGACA-5', 919, 3'-GGTCC-5', 948, 3'-GGACA-5', 967, 3'-GGTCG-5', 976, 3'-AGTCC-5', 984, 3'-GGACC-5', 1015, 3'-GGTCG-5', 1061, 3'-AGACA-5', 1085, 3'-GGACA-5', 1131, 3'-GGTCG-5', 1140, 3'-GGTCG-5', 1194, 3'-GGACC-5', 1198, 3'-GGTTG-5', 1203, 3'-AGATG-5', 1224, 3'-GGACA-5', 1258, 3'-GGTCG-5', 1267, 3'-AGTCC-5', 1275, 3'-GGATC-5', 1306, 3'-GGTCA-5', 1352, 3'-AGACC-5', 1356, 3'-AGTTG-5', 1406, 3'-AGACA-5', 1452, 3'-GGTCC-5', 1460, 3'-AGTCG-5', 1486, 3'-AGTTG-5', 1513, 3'-AGATA-5', 1525, 3'-GGTCA-5', 1532, 3'-GGTCG-5', 1611, 3'-AGACA-5', 1776, 3'-GGTCG-5', 1785, 3'-GGTTC-5', 1817, 3'-GGACC-5', 1841, 3'-AGATG-5', 1867, 3'-GGACA-5', 1911, 3'-GGTCG-5', 1920, 3'-GGACC-5', 1959, 3'-GGTCG-5', 2005, 3'-GGACC-5', 2009, 3'-AGACA-5', 2029, 3'-GGTCC-5', 2077, 3'-GGATC-5', 2093, 3'-AGTCC-5', 2134, 3'-GGTTG-5', 2148, 3'-AGATG-5', 2169, 3'-GGTCA-5', 2211, 3'-AGTCC-5', 2250, 3'-GGTCG-5', 2264, 3'-GGACC-5', 2268, 3'-AGATG-5', 2294, 3'-GGACA-5', 2337, 3'-GGTCG-5', 2346, 3'-GGACC-5', 2385, 3'-GGTCG-5', 2431, 3'-GGACC-5', 2435, 3'-AGTTA-5', 2496, 3'-GGTCC-5', 2519, 3'-GGACA-5', 2538, 3'-GGTTG-5', 2547, 3'-AGTCC-5', 2587, 3'-GGTCA-5', 2601, 3'-GGTTG-5', 2610, 3'-AGTCG-5', 2650, 3'-GGTCA-5', 2654, 3'-GGACA-5', 2672, 3'-GGTCG-5', 2681, 3'-GGACC-5', 2720, 3'-GGTCG-5', 2766, 3'-GGACC-5', 2770, 3'-GGTTA-5', 2848, 3'-AGATG-5', 
2988, 3'-GGATA-5', 2996, 3'-GGACA-5', 3061, 3'-GGTCG-5', 3070, 3'-AGTCC-5', 3110, 3'-GGTCG-5', 3124, 3'-GGACC-5', 3128, 3'-GGTTG-5', 3137, 3'-AGATG-5', 3158, 3'-GGACA-5', 3200, 3'-AGTCG-5', 3204, 3'-GGTCG-5', 3209, 3'-AGTCC-5', 3217, 3'-GGTCC-5', 3249, 3'-GGTTC-5', 3273, 3'-GGTCG-5', 3294, 3'-GGACC-5', 3298, 3'-AGACA-5', 3319, 3'-AGTCC-5', 3396, 3'-AGTTG-5', 3523, 3'-AGACA-5', 3556, 3'-GGTCC-5', 3564, 3'-GGACG-5', 3579, 3'-GGTCC-5', 3585, 3'-GGTCG-5', 3682, 3'-GGTCG-5', 3701, 3'-AGACG-5', 3706, 3'-GGTCG-5', 3731, 3'-GGACC-5', 3744, 3'-AGACC-5', 3835, 3'-AGTTC-5', 3844, 3'-GGACG-5', 3861, 3'-GGTCC-5', 3871, 3'-GGTCC-5', 3885, 3'-GGACC-5', 3906, 3'-GGTCC-5', 3951, 3'-GGACA-5', 3970, 3'-GGTTG-5', 3979, 3'-GGTTC-5', 4019, 3'-AGTTC-5', 4027, 3'-GGTCG-5', 4033, 3'-GGACC-5', 4037, 3'-AGATG-5', 4062, 3'-GGTCC-5', 4102, 3'-GGACA-5', 4121, 3'-GGTCG-5', 4130, 3'-AGTCC-5', 4138, 3'-GGTCC-5', 4170, 3'-AGTTC-5', 4178, 3'-GGACA-5', 4208, 3'-AGATG-5', 4212, 3'-GGTCC-5', 4253, 3'-GGTCG-5', 4261, 3'-GGACC-5', 4300, 3'-GGTCG-5', 4345, 3'-GGACC-5', 4349, 3'-GGACA-5', 4369, 3'-GGTCA-5', 4415, 3'-AGATG-5', 4430, 3'-AGTCC-5', 4436, 3'-GGTCG-5', 4480, 3'-AGTCG-5', 4489, 3'-GGACC-5', 4494, 3'-GGACC-5', 4546, and their complements.
  2. negative strand in the positive direction (from ZNF497 to A1BG) is SuccessablesDPE-+.bas, looking for 3'-A/G-G-A/T-C/T-A/C/G-5', 73, 3'-GGACC-5' at 37, 3'-GGATG-5' at 59, 3'-GGTCA-5' at 153, 3'-AGATG-5' at 166, 3'-AGTCC-5' at 172, 3'-GGACC-5' at 187, 3'-GGTCC-5' at 218, 3'-GGTTC-5' at 305, 3'-GGACG-5' at 323, 3'-GGACG-5' at 359, 3'-AGACG-5' at 398, 3'-GGACG-5' at 410, 3'-AGACC-5' at 440, 3'-AGACA-5' at 712, 3'-AGTCC-5' at 757, 3'-AGATC-5' at 864, 3'-AGATC-5' at 964, 3'-AGTCG-5' at 1528, 3'-GGACG-5' at 1670, 3'-GGTCG-5' at 1687, 3'-GGACA-5' at 1693, 3'-AGTCC-5' at 1826, 3'-AGTCC-5' at 1841, 3'-GGACA-5' at 1869, 3'-GGATG-5' at 1878, 3'-GGTTC-5' at 1926, 3'-AGTTC-5' at 1987, 3'-AGTCC-5' at 2026, 3'-GGTCA-5' at 2035, 3'-AGTCA-5' at 2100, 3'-AGTTA-5' at 2134, 3'-GGTCA-5' at 2220, 3'-AGATC-5' at 2230, 3'-GGATG-5' at 2409, 3'-GGACA-5' at 2460, 3'-AGTCA-5' at 2607, 3'-AGTCA-5' at 2613, 3'-AGTCA-5' at 2618, 3'-GGATA-5' at 2659, 3'-AGTTA-5' at 2666, 3'-GGATG-5' at 2714, 3'-GGATA-5' at 2737, 3'-AGACC-5' at 2861, 3'-GGTTC-5' at 2922, 3'-AGTTC-5' at 2954, 3'-AGTCC-5' at 2998, 3'-GGTTA-5' at 3024, 3'-GGTTG-5' at 3050, 3'-AGTCC-5' at 3084, 3'-GGACA-5' at 3131, 3'-GGACC-5' at 3172, 3'-AGTCG-5' at 3283, 3'-AGTTA-5' at 3381, 3'-AGATG-5' at 3418, 3'-GGATG-5' at 3457, 3'-AGATG-5' at 3475, 3'-GGTTG-5' at 3490, 3'-GGACA-5' at 3530, 3'-GGACC-5' at 3545, 3'-AGACC-5' at 3550, 3'-GGATG-5' at 3574, 3'-GGTCA-5' at 3820, 3'-AGTCC-5' at 3863, 3'-AGACA-5' at 3893, 3'-GGTTC-5' at 4073, 3'-GGATC-5' at 4080, 3'-GGATG-5' at 4099, 3'-AGTTC-5' at 4200, 3'-GGACA-5' at 4252, 3'-GGTCA-5' at 4269, 3'-AGACG-5' at 4319, 3'-AGACA-5' at 4332, 3'-GGTCC-5' at 4420, and their complements.
  3. positive strand in the negative direction is SuccessablesDPE+-.bas, looking for 3'-A/G-G-A/T-C/T-A/C/G-5', 101, 3'-GGACC-5', 32, 3'-AGATA-5', 57, 3'-GGATA-5', 74, 3'-AGTTG-5', 84, 3'-GGATA-5', 98, 3'-GGATA-5', 108, 3'-AGTCG-5', 157, 3'-AGACA-5', 170, 3'-GGTCA-5', 206, 3'-AGATG-5', 244, 3'-AGTTC-5', 253, 3'-AGACA-5', 422, 3'-GGATC-5', 430, 3'-GGTCA-5', 439, 3'-GGATC-5', 525, 3'-AGACA-5', 559, 3'-GGTCA-5', 568, 3'-GGTCA-5', 576, 3'-AGATC-5', 589, 3'-GGATC-5', 703, 3'-GGTCA-5', 712, 3'-AGTTC-5', 719, 3'-AGACC-5', 725, 3'-GGATG-5', 784, 3'-GGTTG-5', 862, 3'-AGATC-5', 877, 3'-AGATC-5', 972, 3'-GGTTG-5', 1028, 3'-GGACG-5', 1151, 3'-GGATC-5', 1167, 3'-AGTTC-5', 1177, 3'-GGTTG-5', 1319, 3'-AGATG-5', 1438, 3'-AGACA-5', 1569, 3'-AGATA-5', 1595, 3'-GGATC-5', 1812, 3'-AGATG-5', 1828, 3'-AGACC-5', 1834, 3'-AGATC-5', 1987, 3'-GGACA-5', 2117, 3'-AGACC-5', 2121, 3'-AGACC-5', 2145, 3'-AGATA-5', 2177, 3'-GGTTG-5', 2234, 3'-GGATC-5', 2239, 3'-GGTCA-5', 2248, 3'-AGACC-5', 2261, 3'-GGACA-5', 2271, 3'-GGTTG-5', 2398, 3'-AGATC-5', 2413, 3'-AGTCC-5', 2543, 3'-GGATC-5', 2574, 3'-GGTCA-5', 2585, 3'-AGTTG-5', 2592, 3'-AGACC-5', 2598, 3'-AGTTG-5', 2704, 3'-AGTTG-5', 2733, 3'-AGACA-5', 2880, 3'-AGATG-5', 2894, 3'-AGATG-5', 2905, 3'-AGACA-5', 2948, 3'-AGATA-5', 2981, 3'-GGATC-5', 3097, 3'-AGTTG-5', 3115, 3'-AGACC-5', 3121, 3'-GGTTG-5', 3261, 3'-AGATC-5', 3276, 3'-GGACA-5', 3389, 3'-AGACA-5', 3433, 3'-AGATA-5', 3465, 3'-AGATC-5', 3488, 3'-GGTTG-5', 3532, 3'-GGTTG-5', 3605, 3'-AGATG-5', 3620, 3'-AGATG-5', 3627, 3'-GGATA-5', 3655, 3'-GGACA-5', 3756, 3'-AGACC-5', 3761, 3'-GGTTG-5', 3804, 3'-GGTCG-5', 3813, 3'-GGACC-5', 3868, 3'-AGATG-5', 3919, 3'-GGTTG-5', 3945, 3'-GGATC-5', 4006, 3'-AGTTC-5', 4024, 3'-AGACC-5', 4030, 3'-AGTTG-5', 4096, 3'-AGTCC-5', 4126, 3'-GGATC-5', 4157, 3'-AGTTC-5', 4175, 3'-AGACA-5', 4181, 3'-AGACC-5', 4204, 3'-AGACG-5', 4235, 3'-GGATC-5', 4288, 3'-GGTCA-5', 4307, 3'-AGACC-5', 4365, 3'-AGTTC-5', 4417, 3'-GGACA-5', 4468, 3'-AGATC-5', 4475, 3'-AGTCC-5', 4500, 3'-AGACA-5', 
4507, and their complements.
  4. positive strand in the positive direction is SuccessablesDPE++.bas, looking for 3'-A/G-G-A/T-C/T-A/C/G-5', 159, 3'-GGTCC-5' at 8, 3'-GGTCC-5' at 33, 3'-GGACC-5' at 40, 3'-AGTCC-5' at 90, 3'-AGACA-5' at 98, 3'-AGACC-5' at 102, 3'-GGACA-5' at 144, 3'-GGTTC-5' at 177, 3'-GGACG-5' at 191, 3'-GGTCC-5' at 215, 3'-AGACG-5' at 223, 3'-AGACC-5' at 270, 3'-GGACC-5' at 286, 3'-GGTCG-5' at 329, 3'-GGTCC-5' at 424, 3'-GGACG-5' at 435, 3'-AGTCG-5' at 511, 3'-GGTCC-5' at 515, 3'-GGACC-5' at 598, 3'-GGTTG-5' at 607, 3'-AGTCG-5' at 613, 3'-GGTCG-5' at 617, 3'-GGTCG-5' at 623, 3'-GGATG-5' at 649, 3'-GGTCC-5' at 707, 3'-GGACG-5' at 807, 3'-AGTCG-5' at 831, 3'-GGTTG-5' at 843, 3'-GGACC-5' at 847, 3'-GGACA-5' at 891, 3'-GGACG-5' at 907, 3'-AGTCG-5' at 931, 3'-GGTTG-5' at 943, 3'-GGACC-5' at 947, 3'-GGACA-5' at 991, 3'-GGACG-5' at 1075, 3'-GGACG-5' at 1118, 3'-GGTCG-5' at 1127, 3'-GGTCC-5' at 1175, 3'-GGATG-5' at 1195, 3'-GGACC-5' at 1199, 3'-GGTCA-5' at 1250, 3'-AGTCG-5' at 1267, 3'-GGTCG-5' at 1271, 3'-GGTTG-5' at 1279, 3'-GGATG-5' at 1283, 3'-GGACG-5' at 1311, 3'-GGTCG-5' at 1357, 3'-GGTCG-5' at 1363, 3'-GGACG-5' at 1369, 3'-AGACC-5' at 1376, 3'-AGACG-5' at 1395, 3'-GGACG-5' at 1411, 3'-GGTCG-5' at 1457, 3'-GGTCG-5' at 1463, 3'-GGACG-5' at 1469, 3'-AGACC-5' at 1476, 3'-AGACG-5' at 1495, 3'-GGATG-5' at 1573, 3'-AGTCG-5' at 1603, 3'-AGTTG-5' at 1621, 3'-AGACG-5' at 1733, 3'-GGACG-5' at 1776, 3'-GGACC-5' at 1815, 3'-GGTCC-5' at 1855, 3'-GGACA-5' at 1860, 3'-AGACC-5' at 1864, 3'-GGTCC-5' at 1893, 3'-AGACC-5' at 1992, 3'-GGTTG-5' at 2012, 3'-GGTCA-5' at 2024, 3'-GGTCG-5' at 2052, 3'-AGTCA-5' at 2060, 3'-AGTCA-5' at 2098, 3'-AGTCG-5' at 2102, 3'-AGTCC-5' at 2115, 3'-AGATC-5' at 2167, 3'-AGACA-5' at 2182, 3'-AGTCG-5' at 2198, 3'-AGTTA-5' at 2233, 3'-GGACA-5' at 2250, 3'-AGACA-5' at 2260, 3'-AGACA-5' at 2308, 3'-GGTCC-5' at 2316, 3'-AGTCC-5' at 2372, 3'-AGTCG-5' at 2390, 3'-GGTTC-5' at 2398, 3'-GGACC-5' at 2433, 3'-GGATC-5' at 2481, 3'-GGACC-5' at 2501, 3'-AGTTC-5' at 2508, 3'-GGACG-5' 
at 2520, 3'-AGTCG-5' at 2526, 3'-GGACC-5' at 2569, 3'-GGTCC-5' at 2574, 3'-GGTTC-5' at 2593, 3'-GGTCA-5' at 2605, 3'-AGTTC-5' at 2615, 3'-AGTCC-5' at 2620, 3'-GGTCC-5' at 2780, 3'-AGACG-5' at 2856, 3'-GGTCC-5' at 2876, 3'-AGACC-5' at 2883, 3'-GGACC-5' at 2891, 3'-GGTTA-5' at 2908, 3'-AGACA-5' at 2925, 3'-AGTCA-5' at 2936, 3'-AGACA-5' at 2957, 3'-AGACG-5' at 2975, 3'-AGACC-5' at 2983, 3'-GGACC-5' at 2988, 3'-GGTCA-5' at 2996, 3'-GGTCC-5' at 3016, 3'-AGACC-5' at 3021, 3'-AGTCC-5' at 3034, 3'-AGTCG-5' at 3041, 3'-GGACC-5' at 3047, 3'-AGACG-5' at 3060, 3'-GGTCA-5' at 3082, 3'-GGTCC-5' at 3111, 3'-AGTCG-5' at 3155, 3'-GGTCG-5' at 3239, 3'-AGATA-5' at 3258, 3'-AGACG-5' at 3267, 3'-AGACG-5' at 3278, 3'-AGTTG-5' at 3290, 3'-GGACC-5' at 3296, 3'-AGACG-5' at 3306, 3'-AGACG-5' at 3358, 3'-GGACC-5' at 3362, 3'-GGTCA-5' at 3379, 3'-AGACC-5' at 3405, 3'-AGTTA-5' at 3424, 3'-GGACA-5' at 3434, 3'-GGACC-5' at 3496, 3'-GGTCC-5' at 3536, 3'-GGACA-5' at 3617, 3'-GGACA-5' at 3622, 3'-GGTTG-5' at 3633, 3'-GGACC-5' at 3679, 3'-GGTCC-5' at 3687, 3'-GGTCG-5' at 3720, 3'-AGTCC-5' at 3728, 3'-GGACC-5' at 3758, 3'-AGTCG-5' at 3775, 3'-GGACC-5' at 3787, 3'-GGTCA-5' at 3841, 3'-AGTCC-5' at 3868, 3'-AGTCG-5' at 3997, 3'-AGTCG-5' at 4023, 3'-GGTCC-5' at 4032, 3'-AGTCG-5' at 4052, 3'-AGATC-5' at 4064, 3'-AGATC-5' at 4076, 3'-GGACG-5' at 4231, 3'-AGTCA-5' at 4271, 3'-GGACC-5' at 4409, 3'-AGACC-5' at 4416, 3'-GGACC-5' at 4424, and their complements.
  5. inverse, negative strand, negative direction, is SuccessablesDPEi--.bas, looking for 3'-A/C/G-C/T-A/T-G-A/G-5', 58, 3'-CCTGG-5', 32, 3'-ACAGA-5', 479, 3'-GTAGG-5', 593, 3'-ATTGG-5', 614, 3'-ACTGG-5', 734, 3'-GCAGA-5', 754, 3'-CTTGG-5', 846, 3'-ACAGA-5', 921, 3'-CTAGG-5', 973, 3'-CTTGG-5', 1012, 3'-ACTGA-5', 1051, 3'-ACAGA-5', 1087, 3'-GCTGG-5', 1191, 3'-ACAGA-5', 1222, 3'-CTTGG-5', 1303, 3'-GTTGG-5', 1407, 3'-CTAGA-5', 1482, 3'-GTTGG-5', 1514, 3'-ATAGG-5', 1529, 3'-GTAGG-5', 1572, 3'-CTTGA-5', 1685, 3'-ATAGA-5', 1731, 3'-GCAGA-5', 1774, 3'-CTAGG-5', 1813, 3'-GTAGG-5', 1838, 3'-GTAGA-5', 1863, 3'-CTTGG-5', 1956, 3'-ACAGA-5', 2031, 3'-ACAGA-5', 2165, 3'-ACTGG-5', 2189, 3'-GTAGA-5', 2290, 3'-CTTGG-5', 2382, 3'-CTTGG-5', 2717, 3'-ACTGA-5', 2786, 3'-GTTGG-5', 2844, 3'-GTTGA-5', 2911, 3'-ACAGA-5', 2986, 3'-GTAGA-5', 3154, 3'-CTTGG-5', 3245, 3'-ACAGA-5', 3321, 3'-CTTGA-5', 3460, 3'-GTTGA-5', 3524, 3'-GTAGA-5', 3551, 3'-CCTGA-5', 3640, 3'-GCAGG-5', 3698, 3'-CCTGA-5', 3747, 3'-CTTGG-5', 3784, 3'-ACAGA-5', 3833, 3'-GTTGA-5', 3849, 3'-CCTGG-5', 3868, 3'-GTAGG-5', 3903, 3'-GTAGA-5', 4058, 3'-ACAGA-5', 4210, 3'-CCTGA-5', 4327, 3'-ACAGA-5', 4371, 3'-CTTGG-5', 4451, 3'-GTAGG-5', 4456, 3'-CTAGG-5', 4476,
  6. inverse, negative strand, positive direction, is SuccessablesDPEi-+.bas, looking for 3'-A/C/G-C/T-A/T-G-A/G-5', 152 , 3'-CCAGG-5' at 8 , 3'-CCAGA-5' at 15 , 3'-ATTGG-5' at 24 , 3'-CCAGG-5' at 33 , 3'-CCTGG-5' at 40 , 3'-ACAGG-5' at 157 , 3'-GCAGG-5' at 194 , 3'-CCAGA-5' at 204 , 3'-CCAGG-5' at 215 , 3'-GCTGG-5' at 277 , 3'-CCTGG-5' at 286 , 3'-GCAGG-5' at 318 , 3'-ACTGG-5' at 347 , 3'-ACAGG-5' at 365 , 3'-GCAGG-5' at 379 , 3'-GCTGG-5' at 386 , 3'-GCAGA-5' at 396 , 3'-GCTGG-5' at 417 , 3'-CCAGG-5' at 424 , 3'-GCAGA-5' at 438 , 3'-CCAGA-5' at 468 , 3'-CCAGG-5' at 515 , 3'-ACAGG-5' at 552 , 3'-CCTGG-5' at 598 , 3'-GCAGG-5' at 658 , 3'-CCAGG-5' at 707 , 3'-CCTGA-5' at 725 , 3'-GCTGG-5' at 779 , 3'-CCAGA-5' at 835 , 3'-CCTGG-5' at 847 , 3'-CCTGA-5' at 859 , 3'-CCAGA-5' at 935 , 3'-CCTGG-5' at 947 , 3'-CCTGA-5' at 959 , 3'-ACTGG-5' at 1140 , 3'-CCAGG-5' at 1175 , 3'-CCTGG-5' at 1199 , 3'-ACTGA-5' at 1286 , 3'-GCAGA-5' at 1316 , 3'-GCAGA-5' at 1416 , 3'-CCAGA-5' at 1631 , 3'-CCTGA-5' at 1660 , 3'-CCTGA-5' at 1676 , 3'-GCTGG-5' at 1736 , 3'-CCAGA-5' at 1742 , 3'-GCTGG-5' at 1779 , 3'-GCAGG-5' at 1788 , 3'-CTTGG-5' at 1799 , 3'-CCTGG-5' at 1815 , 3'-CCAGG-5' at 1855 , 3'-GTAGG-5' at 1875 , 3'-CCAGG-5' at 1893 , 3'-GCAGG-5' at 1905 , 3'-GCAGA-5' at 1937 , 3'-ACTGG-5' at 1953 , 3'-ACAGG-5' at 1966 , 3'-ACAGG-5' at 2125 , 3'-GTTGG-5' at 2185 , 3'-CCTGA-5' at 2211 , 3'-CCAGA-5' at 2228 , 3'-GTAGG-5' at 2255 , 3'-GCAGG-5' at 2296 , 3'-CCAGG-5' at 2316 , 3'-GCTGG-5' at 2320 , 3'-GCTGG-5' at 2405 , 3'-ACAGA-5' at 2414 , 3'-CCTGG-5' at 2433 , 3'-CTAGG-5' at 2482 , 3'-CCTGG-5' at 2501 , 3'-GTTGG-5' at 2541 , 3'-ATAGG-5' at 2550 , 3'-CCTGG-5' at 2569 , 3'-CCAGG-5' at 2574 , 3'-ATAGA-5' at 2627 , 3'-CTAGG-5' at 2639 , 3'-ACAGA-5' at 2652 , 3'-ACTGA-5' at 2674 , 3'-GCAGG-5' at 2683 , 3'-GCAGA-5' at 2721 , 3'-GCTGG-5' at 2734 , 3'-GCAGG-5' at 2745 , 3'-GCTGG-5' at 2770 , 3'-CCAGG-5' at 2780 , 3'-GCTGG-5' at 2810 , 3'-GTTGG-5' at 2816 , 3'-ACAGA-5' at 2837 , 3'-GCAGA-5' at 2859 , 
3'-CCAGG-5' at 2876 , 3'-CCTGG-5' at 2891 , 3'-GCTGA-5' at 2915 , 3'-CCTGA-5' at 2968 , 3'-CCTGG-5' at 2988 , 3'-CCAGG-5' at 3016 , 3'-CCTGG-5' at 3047 , 3'-CCAGA-5' at 3091 , 3'-CCAGG-5' at 3111 , 3'-ACTGG-5' at 3117 , 3'-GCAGG-5' at 3128 , 3'-ACAGA-5' at 3133 , 3'-GCAGG-5' at 3147 , 3'-ACAGA-5' at 3179 , 3'-GCAGA-5' at 3214 , 3'-CCAGA-5' at 3221 , 3'-GCTGG-5' at 3242 , 3'-CCTGG-5' at 3296 , 3'-ACTGG-5' at 3345 , 3'-CCTGG-5' at 3362 , 3'-GTAGA-5' at 3416 , 3'-GCAGG-5' at 3466 , 3'-GCAGA-5' at 3473 , 3'-CTAGG-5' at 3484 , 3'-CCTGG-5' at 3496 , 3'-CTAGG-5' at 3522 , 3'-GCTGG-5' at 3526 , 3'-CCAGG-5' at 3536 , 3'-CCAGA-5' at 3548 , 3'-ACAGG-5' at 3571 , 3'-GCTGA-5' at 3588 , 3'-ACAGG-5' at 3636 , 3'-GCAGG-5' at 3662 , 3'-CCTGG-5' at 3679 , 3'-CCAGG-5' at 3687 , 3'-GCAGG-5' at 3694 , 3'-ACTGA-5' at 3735 , 3'-GTAGG-5' at 3753 , 3'-CCTGG-5' at 3758 , 3'-GCAGG-5' at 3768 , 3'-GCTGA-5' at 3778 , 3'-CCTGG-5' at 3787 , 3'-CCAGA-5' at 3806 , 3'-GCAGA-5' at 3831 , 3'-CCAGA-5' at 3891 , 3'-GCAGA-5' at 3916 , 3'-ACAGG-5' at 3975 , 3'-GCTGG-5' at 3989 , 3'-CTTGA-5' at 4016 , 3'-CCAGG-5' at 4032 , 3'-GTAGA-5' at 4036 , 3'-CTAGA-5' at 4065 , 3'-ACAGG-5' at 4070 , 3'-CTAGG-5' at 4077 , 3'-ACTGA-5' at 4089 , 3'-CTTGA-5' at 4131 , 3'-ATTGA-5' at 4161 , 3'-GCTGG-5' at 4177 , 3'-CCTGA-5' at 4186 , 3'-CCTGA-5' at 4214 , 3'-CTTGG-5' at 4300 , 3'-GCAGA-5' at 4317 , 3'-CCAGA-5' at 4330 , 3'-CCTGG-5' at 4409 , 3'-CCTGG-5' at 4424.
  7. inverse, positive strand, negative direction, is SuccessablesDPEi+-.bas, looking for 3'-A/C/G-C/T-A/T-G-A/G-5', 174, 3'-ACAGA-5', 13, 3'-ACTGA-5', 17, 3'-GTTGA-5', 85, 3'-ATAGA-5', 100, 3'-GTAGG-5', 119, 3'-ACTGA-5', 130, 3'-GCTGA-5', 140, 3'-ACAGA-5', 168, 3'-CCAGG-5', 262, 3'-GTAGA-5', 284, 3'-ACAGA-5', 289, 3'-ACTGA-5', 307, 3'-CTTGG-5', 328, 3'-ATAGA-5', 355, 3'-ACAGG-5', 424, 3'-CCTGG-5', 459, 3'-CCTGG-5', 508, 3'-ACAGG-5', 561, 3'-GCAGG-5', 565, 3'-ATTGA-5', 585, 3'-CCTGG-5', 596, 3'-ATTGG-5', 643, 3'-CCAGG-5', 648, 3'-GCAGG-5', 697, 3'-CCTGA-5', 732, 3'-GCTGG-5', 781, 3'-GCTGA-5', 825, 3'-GCAGG-5', 831, 3'-CTTGA-5', 843, 3'-CCAGG-5', 850, 3'-CCTGG-5', 899, 3'-ACAGA-5', 907, 3'-CCAGG-5', 948, 3'-GTAGA-5', 970, 3'-GCTGA-5', 991, 3'-GCAGG-5', 997, 3'-CTTGA-5', 1009, 3'-CCTGG-5', 1015, 3'-GCAGA-5', 1023, 3'-ATTGG-5', 1045, 3'-ACAGA-5', 1073, 3'-GCTGG-5', 1111, 3'-CCTGA-5', 1173, 3'-CCTGG-5', 1198, 3'-GCTGA-5', 1282, 3'-GCAGG-5', 1288, 3'-CTTGA-5', 1300, 3'-CTAGG-5', 1307, 3'-GCAGA-5', 1314, 3'-CCAGA-5', 1411, 3'-CCAGG-5', 1460, 3'-GCTGG-5', 1464, 3'-CCAGA-5', 1518, 3'-ACAGA-5', 1567, 3'-GCAGA-5', 1614, 3'-CCTGA-5', 1623, 3'-CTTGG-5', 1649, 3'-GTAGA-5', 1653, 3'-CCAGA-5', 1670, 3'-ATAGA-5', 1710, 3'-GCTGG-5', 1746, 3'-GCTGG-5', 1756, 3'-GCTGA-5', 1800, 3'-GCAGG-5', 1823, 3'-CCTGG-5', 1841, 3'-GTTGA-5', 1853, 3'-GCTGG-5', 1891, 3'-CTTGG-5', 1927, 3'-ACTGA-5', 1935, 3'-GCAGG-5', 1941, 3'-CCTGG-5', 1959, 3'-GCAGA-5', 1967, 3'-CCTGG-5', 2009, 3'-ACAGA-5', 2017, 3'-GCTGG-5', 2069, 3'-CCAGG-5', 2077, 3'-GCTGA-5', 2109, 3'-ACAGA-5', 2119, 3'-CTTGA-5', 2127, 3'-GCTGA-5', 2226, 3'-GTTGG-5', 2235, 3'-CCTGG-5', 2268, 3'-GCTGG-5', 2326, 3'-GCTGA-5', 2361, 3'-GCAGG-5', 2367, 3'-CTTGA-5', 2379, 3'-CCTGG-5', 2385, 3'-GCAGG-5', 2389, 3'-CCTGG-5', 2435, 3'-ACAGA-5', 2443, 3'-ACAGG-5', 2514, 3'-CCAGG-5', 2519, 3'-GCTGA-5', 2562, 3'-GCAGG-5', 2568, 3'-CTTGA-5', 2580, 3'-GTTGA-5', 2593, 3'-ACAGG-5', 2689, 3'-GCTGA-5', 2696, 3'-GTTGA-5', 2705, 3'-CTTGA-5', 2714, 3'-CCTGG-5', 
2720, 3'-GCTGA-5', 2744, 3'-CCTGG-5', 2770, 3'-ACAGA-5', 2778, 3'-ACAGA-5', 2878, 3'-ATAGA-5', 2903, 3'-CTTGG-5', 2921, 3'-GCTGG-5', 3035, 3'-GCTGG-5', 3041, 3'-GCTGA-5', 3085, 3'-CTTGA-5', 3103, 3'-GTTGG-5', 3116, 3'-CCTGG-5', 3128, 3'-GCTGG-5', 3180, 3'-GCTGA-5', 3224, 3'-CTTGA-5', 3242, 3'-CCAGG-5', 3249, 3'-GTAGA-5', 3256, 3'-CCTGG-5', 3298, 3'-ATTGA-5', 3358, 3'-CTTGA-5', 3401, 3'-ATAGA-5', 3422, 3'-GCAGA-5', 3431, 3'-ATAGG-5', 3447, 3'-CTAGA-5', 3463, 3'-CCAGA-5', 3486, 3'-GTTGA-5', 3505, 3'-ATTGG-5', 3529, 3'-GTTGA-5', 3533, 3'-ACTGA-5', 3542, 3'-CCAGG-5', 3564, 3'-CTTGA-5', 3571, 3'-CCAGG-5', 3585, 3'-GCAGA-5', 3589, 3'-GTTGG-5', 3606, 3'-GCTGA-5', 3649, 3'-ACAGA-5', 3672, 3'-GCTGG-5', 3719, 3'-CCTGG-5', 3744, 3'-ACTGG-5', 3749, 3'-CCTGA-5', 3781, 3'-CTTGG-5', 3793, 3'-GTTGA-5', 3805, 3'-GTAGA-5', 3820, 3'-GCTGG-5', 3864, 3'-CCAGG-5', 3871, 3'-CCAGG-5', 3885, 3'-CCTGG-5', 3906, 3'-ACAGA-5', 3917, 3'-CCTGA-5', 3932, 3'-GTTGG-5', 3942, 3'-GTTGG-5', 3946, 3'-CCAGG-5', 3951, 3'-GCTGA-5', 3994, 3'-CTTGA-5', 4012, 3'-CCTGG-5', 4037, 3'-ATAGA-5', 4079, 3'-GTTGG-5', 4097, 3'-CCAGG-5', 4102, 3'-GCTGA-5', 4145, 3'-CCAGG-5', 4170, 3'-CTTGG-5', 4188, 3'-CCAGA-5', 4233, 3'-CCAGG-5', 4253, 3'-CTTGG-5', 4268, 3'-GCTGA-5', 4276, 3'-GCAGG-5', 4282, 3'-CTTGA-5', 4294, 3'-CCTGG-5', 4300, 3'-CCTGG-5', 4349, 3'-CCAGA-5', 4448, 3'-CCTGG-5', 4494, 3'-ACAGA-5', 4518, 3'-CCTGG-5', 4546,
  8. inverse, positive strand, positive direction, is SuccessablesDPEi++.bas, looking for 3'-A/C/G-C/T-A/T-G-A/G-5', 95, 3'-GTAGG-5' at 30, 3'-CCTGG-5' at 37, 3'-ACAGG-5' at 82, 3'-ACAGA-5' at 100, 3'-CCTGG-5' at 187, 3'-CCAGG-5' at 218, 3'-ACAGA-5' at 268, 3'-GTTGG-5' at 608, 3'-GTAGG-5' at 629, 3'-GTAGG-5' at 698, 3'-CCTGA-5' at 746, 3'-CCTGA-5' at 814, 3'-GTTGG-5' at 844, 3'-CTAGG-5' at 865, 3'-ACAGG-5' at 893, 3'-GCTGA-5' at 898, 3'-CCTGA-5' at 914, 3'-GTTGG-5' at 944, 3'-CTAGG-5' at 965, 3'-ACAGG-5' at 993, 3'-GCTGA-5' at 998, 3'-GTTGG-5' at 1280, 3'-GCAGA-5' at 1393, 3'-GCAGA-5' at 1493, 3'-GTTGG-5' at 1616, 3'-ACTGG-5' at 1662, 3'-CCAGA-5' at 1711, 3'-ACAGA-5' at 1731, 3'-CTTGG-5' at 1811, 3'-ACAGA-5' at 1862, 3'-GCAGG-5' at 1930, 3'-CTTGA-5' at 1951, 3'-CCAGA-5' at 1958, 3'-GTTGG-5' at 2013, 3'-ACAGA-5' at 2078, 3'-GTAGA-5' at 2111, 3'-GTTGG-5' at 2120, 3'-ACAGA-5' at 2172, 3'-ACTGG-5' at 2213, 3'-CTTGG-5' at 2225, 3'-CCAGA-5' at 2258, 3'-CCTGA-5' at 2271, 3'-GCTGA-5' at 2359, 3'-CTAGG-5' at 2378, 3'-ACAGA-5' at 2466, 3'-CCAGA-5' at 2489, 3'-CTAGG-5' at 2514, 3'-CTTGG-5' at 2579, 3'-CCTGA-5' at 2672, 3'-CTTGG-5' at 2776, 3'-CCTGA-5' at 2820, 3'-GTAGA-5' at 2852, 3'-ACTGG-5' at 2873, 3'-CCAGA-5' at 2941, 3'-ACTGA-5' at 2945, 3'-ACAGA-5' at 3004, 3'-CCAGA-5' at 3019, 3'-ACTGA-5' at 3029, 3'-ACAGA-5' at 3053, 3'-GTAGG-5' at 3108, 3'-CCTGG-5' at 3172, 3'-GCAGG-5' at 3203, 3'-CCAGA-5' at 3245, 3'-GCAGA-5' at 3256, 3'-GTTGA-5' at 3291, 3'-CCAGA-5' at 3299, 3'-GTAGA-5' at 3329, 3'-ATAGG-5' at 3384, 3'-ACAGA-5' at 3392, 3'-GTAGA-5' at 3403, 3'-CCTGG-5' at 3545, 3'-ACAGG-5' at 3577, 3'-CCAGA-5' at 3608, 3'-ACAGG-5' at 3619, 3'-GTAGG-5' at 3629, 3'-ACTGG-5' at 3714, 3'-ATTGA-5' at 3733, 3'-CCAGA-5' at 3771, 3'-ACTGG-5' at 3784, 3'-GCTGA-5' at 3801, 3'-CTTGG-5' at 3838, 3'-CTTGG-5' at 3856, 3'-GTTGG-5' at 3911, 3'-CTTGG-5' at 3937, 3'-ACTGG-5' at 4018, 3'-CTTGA-5' at 4048, 3'-GCAGA-5' at 4056, 3'-CTAGG-5' at 4081, 3'-GTAGG-5' at 4183, 3'-ACTGG-5' at 4216, 3'-GCTGG-5' 
at 4358, 3'-ACAGG-5' at 4367, 3'-CCAGA-5' at 4380, 3'-CCAGA-5' at 4414, 3'-CCAGG-5' at 4420.
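
The DPE consensus is degenerate, so a single search pattern stands for a whole family of concrete pentamers. A minimal sketch that enumerates that family with `itertools.product`; the function name is hypothetical and the code is an illustration, not the SuccessablesDPE programs.

```python
from itertools import product

def expand(consensus: str) -> list[str]:
    """List every concrete sequence allowed by a consensus such as 'A/G-G-A/T'."""
    choices = [part.split("/") for part in consensus.split("-")]
    return ["".join(combo) for combo in product(*choices)]

DPE = "A/G-G-A/T-C/T-A/C/G"           # direct consensus from the list above
DPE_INVERSE = "A/C/G-C/T-A/T-G-A/G"   # inverse consensus from the list above

dpe_variants = expand(DPE)
inverse_variants = expand(DPE_INVERSE)

print(len(dpe_variants))              # 2 * 1 * 2 * 2 * 3 combinations
print("GGTCG" in dpe_variants, "CCTGG" in inverse_variants)
```

Each inverse variant is one of the direct variants read backwards, so the two families have the same size.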

DREB boxes

There are no dehydration-responsive element-binding (DREB) boxes in either promoter.

E2 boxes

Negative strand in the negative direction there are 5: 3'-ACAGATGT-5', 482, 3'-ACAGATGT-5', 1225, 3'-GCAGTTGG-5', 1514, 3'-ACAGATGT-5', 2989, 3'-ACAGATGT-5', 4213, in the distal promoter.

Positive strand in the negative direction there are 2: 3'-GCAGGTGG-5', 2571, 3'-ACAGATGA-5', 3920.

Inverse complement, negative strand, negative direction there is 1: 3'-CCACCTGT-5', 2117.

Inverse complement, positive strand, negative direction there are 4: 3'-CCACCTGT-5', 394, 3'-ACACCTGT-5', 1131, 3'-GCAACTGC-5', 3851, 3'-ACACCTGT-5', 3970.

Negative strand in the positive direction there is 1: 3'-GCAGATGA-5', 37.

EIF4E basal elements

There are no EIF4E (also written eIF4E) basal elements (4EBEs) in either promoter.

Enhancer boxes

Core promoters

Negative strand in the positive direction there are 2: 3'-CACATG-5', 4364, 3'-CACATG-5', 4370.

Proximal promoters

Positive strand, negative direction there is 1: 3'-CACATG-5' at 4247.

Negative strand, positive direction there are 2: 3'-CACATG-5', 4153, 3'-CACATG-5', 4221.

Distal promoters

Negative strand in the negative direction there are 4: 3'-CACATG-5' at 324, 3'-CACATG-5' at 797, 3'-CACATG-5' at 2213, and 3'-CACATG-5' at 2342.

Positive strand in the negative direction there are 17, 3'-CACATG-5' at 123, 3'-CACATG-5' at 200, 3'-CACATG-5' at 952, 3'-CACATG-5' at 1206, 3'-CACATG-5' at 1849, 3'-CACATG-5' at 1952, 3'-CACATG-5' at 2151, 3'-CACATG-5' at 2276, 3'-CACATG-5' at 2322, 3'-CACATG-5' at 2533, 3'-CACATG-5' at 2613, 3'-CACATG-5' at 2667, 3'-CACATG-5' at 2751, 3'-CACATG-5' at 2783, 3'-CACATG-5' at 4106, 3'-CACATG-5' at 4116.

Negative strand in the positive direction there are 17: 3'-CACATG-5', 1186, 3'-CACATG-5', 1238, 3'-CACATG-5', 1871, 3'-CACATG-5', 1933, 3'-CACATG-5', 2031, 3'-CACATG-5', 2140, 3'-CACATG-5', 2153, 3'-CACATG-5', 2266, 3'-CACATG-5', 2473, 3'-CACATG-5', 3140, 3'-CACATG-5', 3335, 3'-CACATG-5', 3580, 3'-CACATG-5', 3707, 3'-CACATG-5', 3742, 3'-CACATG-5', 3827, 3'-CACATG-5', 3900, 3'-CACATG-5', 3956.

Positive strand in the positive direction there are 4: 3'-CACATG-5', 126, 3'-CACATG-5', 565, 3'-CACATG-5', 2596, 3'-CACATG-5', 3114.

F boxes

GAAC elements

  1. negative strand in the negative direction, looking for 3'-GAACT-5', 13, 3'-GAACT-5', 843, 3'-GAACT-5', 1009, 3'-GAACT-5', 1300, 3'-GAACT-5', 2127, 3'-GAACT-5', 2379, 3'-GAACT-5', 2580, 3'-GAACT-5', 2714, 3'-GAACT-5', 3103, 3'-GAACT-5', 3242, 3'-GAACT-5', 3401, 3'-GAACT-5', 3571, 3'-GAACT-5', 4012, 3'-GAACT-5', 4294,
  2. negative strand in the positive direction, looking for 3'-GAACT-5', 1, 3'-GAACT-5', 609,
  3. positive strand in the negative direction, looking for 3'-GAACT-5', 2, 3'-GAACT-5', 1685, 3'-GAACT-5', 3460,
  4. positive strand in the positive direction, looking for 3'-GAACT-5', 2, 3'-GAACT-5', 577, 3'-GAACT-5', 692,
  5. complement, negative strand, negative direction, looking for 3'-CTTGA-5', 2, 3'-CTTGA-5', 1685, 3'-CTTGA-5', 3460,
  6. complement, negative strand, positive direction, looking for 3'-CTTGA-5', 2, 3'-CTTGA-5', 577, 3'-CTTGA-5', 692,
  7. complement, positive strand, negative direction, looking for 3'-CTTGA-5', 13, 3'-CTTGA-5', 843, 3'-CTTGA-5', 1009, 3'-CTTGA-5', 1300, 3'-CTTGA-5', 2127, 3'-CTTGA-5', 2379, 3'-CTTGA-5', 2580, 3'-CTTGA-5', 2714, 3'-CTTGA-5', 3103, 3'-CTTGA-5', 3242, 3'-CTTGA-5', 3401, 3'-CTTGA-5', 3571, 3'-CTTGA-5', 4012, 3'-CTTGA-5', 4294,
  8. complement, positive strand, positive direction, looking for 3'-CTTGA-5', 1, 3'-CTTGA-5', 609,
  9. inverse complement, negative strand, negative direction, looking for 3'-AGTTC-5', 3, 3'-AGTTC-5', 3844, 3'-AGTTC-5', 4027, 3'-AGTTC-5', 4178,
  10. inverse complement, negative strand, positive direction, looking for 3'-AGTTC-5', 1, 3'-AGTTC-5', 761,
  11. inverse complement, positive strand, negative direction, looking for 3'-AGTTC-5', 6, 3'-AGTTC-5', 253, 3'-AGTTC-5', 719, 3'-AGTTC-5', 1177, 3'-AGTTC-5', 4024, 3'-AGTTC-5', 4175, 3'-AGTTC-5', 4417,
  12. inverse complement, positive strand, positive direction, looking for 3'-AGTTC-5', 0,
  13. inverse, negative strand, negative direction, looking for 3'-TCAAG-5', 6, 3'-TCAAG-5', 253, 3'-TCAAG-5', 719, 3'-TCAAG-5', 1177, 3'-TCAAG-5', 4024, 3'-TCAAG-5', 4175, 3'-TCAAG-5', 4417,
  14. inverse, negative strand, positive direction, looking for 3'-TCAAG-5', 0,
  15. inverse, positive strand, negative direction, looking for 3'-TCAAG-5', 3, 3'-TCAAG-5', 3844, 3'-TCAAG-5', 4027, 3'-TCAAG-5', 4178,
  16. inverse, positive strand, positive direction, looking for 3'-TCAAG-5', 1, 3'-TCAAG-5', 761.
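The four search strings used in the list above (direct, complement, inverse, inverse complement) can be derived mechanically from the direct motif. The following is a minimal sketch; the function names are hypothetical and are not taken from the source, and the sequence is treated as a plain string of A, C, G and T.

```python
# Hypothetical helpers for deriving the four search strings of a motif.
# Names are illustrative only; the source does not describe its tooling.
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")


def complement(motif: str) -> str:
    """Swap each base for its Watson-Crick partner, keeping the order."""
    return motif.translate(COMPLEMENT_TABLE)


def inverse(motif: str) -> str:
    """Reverse the order of the bases without complementing them."""
    return motif[::-1]


def inverse_complement(motif: str) -> str:
    """Complement the motif and then reverse it."""
    return complement(motif)[::-1]


def motif_variants(motif: str) -> dict:
    """Return the four strings searched for in the listings."""
    return {
        "direct": motif,
        "complement": complement(motif),
        "inverse": inverse(motif),
        "inverse complement": inverse_complement(motif),
    }


variants = motif_variants("GAACT")
for name, sequence in variants.items():
    print(f"{name}: 3'-{sequence}-5'")
```

Because the two strands are complementary, a hit for the direct motif on one strand appears as a hit for the complement on the other strand at the same position, which is why items 1 and 7 in the list share the same coordinates.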

GA responsive elements

Only one GARE (an inverse) occurs: between ZSCAN22 and A1BG 3'-AAACAAT-5' at 230 nts and its complement.

GATA boxes

Proximal promoters

Inverse complement, negative strand, positive direction there is 1: 3'-TTTATCAC-5', 4125.

Distal promoters

Positive strand in the negative direction there are 2: 3'-GGGATAGA-5', 100, 3'-ATGATAGA-5', 355.

Inverse complement, negative strand, negative direction there is 1: 3'-GTTATCAT-5', 2500.

Inverse complement, positive strand, negative direction there is 1: 3'-TTTATCTT-5', 1732.

Inverse complement, negative strand, positive direction there is 1: 3'-GTTATCCC-5', 3385.

Inverse complement, positive strand, positive direction there are 2: 3'-GCTATCAG-5', 1840, 3'-TTTATCTT-5', 2628.

G boxes

There are no G boxes in either promoter.

GC boxes

Positive strand in the negative direction there are 2: 3'-TGGGCGTGGT-5', 1898, 3'-TGGGCGTGGT-5', 3048, in the distal promoter.

Inverse complement, negative strand, negative direction there is 1: 3'-ACTCCGCCCA-5', 3092.

Inverse complement, positive strand, negative direction there is 1: 3'-GCTCCGCCTC-5', 1505.

Negative strand in the positive direction there is 1: 3'-TGGGCGGGAC-5', 409.

Inverse complement, positive strand, positive direction there is 1: 3'-GCCACGCCCC-5', 491.

GCC boxes

The GCC box is the same as the AGC box.

GLM boxes

There are no GCN4-like motif (GLM) boxes in either promoter.

Grainy head transcription factor binding sites

H boxes

Core promoters

Between ZSCAN22 and A1BG: There is one inverse and its complement 3'-AGGAGA-5' at 4428 nts.

Between ZNF497 and A1BG: There is an inverse and its complement 3'-AGGACA-5' at 4252. There are five after the TSS: 3'-AGAGAA-5' at 4387, 3'-AGTACA-5' at 4365, 3'-ACCAGA-5' at 4380, 3'-AAGAGA-5' at 4386, 3'-ACGACA-5' at 4392 and their complements.

Proximal promoters

Between ZSCAN22 and A1BG: There is one H box (3'-ANANNA-5'): negative direction, negative strand, 3'-ACACGA-5' at 4402. On the positive strand in the negative direction there are 16: 3'-ACAAAA-5' at 4216, 3'-AAAAAA-5' at 4218, 3'-AAAATA-5' at 4220, 3'-AAATAA-5' at 4221, 3'-ATAATA-5' at 4223, 3'-AAAAAA-5' at 4378, 3'-AAAAGA-5' at 4380, 3'-AAAGAA-5' at 4381, 3'-AGAAAA-5' at 4383, 3'-AAAAAA-5' at 4385, 3'-AAAAGA-5' at 4387, 3'-AAAGAA-5' at 4388, 3'-AGAAAA-5' at 4390, 3'-AAAAGA-5' at 4392, 3'-AAAGAA-5' at 4393, and 3'-AGAAAA-5' at 4395, with their complements on the negative strand, negative direction.

Between ZNF497 and A1BG: There is one H box (3'-ANANNA-5'): 3'-AGAGAA-5' at 4387 in the proximal promoter, negative strand, positive direction. There are four: 3'-TCATGT-5' at 4365, 3'-TGGTCT-5' at 4380, 3'-TTCTCT-5' at 4386, and 3'-TGCTGT-5' at 4392 and their complements in the positive direction.

Distal promoters

Between ZSCAN22 and A1BG, negative strand, negative direction: 3'-AGAGGA-5' at 3387, 3'-AGAGGA-5' at 3638, and 3'-AGAGGA-5' at 3675. One inverse and its complement 3'-AGGAGA-5' at 3790. There are 14 H boxes: 3'-ACACCA-5' at 788, 3'-ACATCA-5' at 2541, 3'-ACACCA-5' at 2659, 3'-ACATTA-5' at 2675, 3'-ATAAAA-5' at 2853, 3'-AAAGTA-5' at 2886, 3'-ACATTA-5' at 3064, 3'-AGATGA-5' at 3159, 3'-ACACCA-5' at 3187, 3'-AGAAGA-5' at 3554, 3'-AGACGA-5' at 3707, 3'-ACACCA-5' at 3811, 3'-ACATTA-5' at 3973, and 3'-ACATCA-5' at 4124.

On the positive strand, negative direction, there are 127 H boxes: 3'-ACCACA-5' at 608, 3'-ACCACA-5' at 793, 3'-ACACCA-5' at 883, 3'-ACCACA-5', 1477, 3'-ACACCA-5' at 2419, 3'-AAAAAA-5' at 2461, 3'-AAAAAA-5' at 2462, 3'-AAAAAA-5' at 2463, 3'-AAAAAA-5' at 2464, 3'-AAAAAA-5' at 2465, 3'-AAAAAA-5' at 2466, 3'-AAAAAA-5' at 2467, 3'-AAAAAA-5' at 2468, 3'-AAAAAA-5' at 2469, 3'-AAAAAA-5' at 2470, 3'-AAAGCA-5' at 2473, 3'-AAAGCA-5' at 2479, 3'-AAACAA-5' at 2484, 3'-AAACAA-5' at 2488, 3'-ACAAAA-5' at 2490, 3'-ATAGTA-5' at 2500, 3'-AGAAAA-5' at 2506, 3'-AAAACA-5' at 2508, 3'-AAACAA-5' at 2509, 3'-AGACCA-5' at 2599, 3'-ATACAA-5' at 2642, 3'-ACAAAA-5' at 2644, 3'-AAATCA-5' at 2648, 3'-ACAGGA-5' at 2690, 3'-AAATCA-5' at 2749, 3'-AGAGCA-5' at 2781, 3'-AAAAGA-5' at 2798, 3'-AAAGAA-5' at 2799, 3'-AAAGAA-5' at 2803, 3'-AGAAAA-5' at 2805, 3'-AAAAGA-5' at 2807, 3'-AGAGAA-5' at 2810, 3'-AGAAGA-5' at 2812, 3'-AGAAAA-5' at 2815, 3'-AAAAAA-5' at 2817, 3'-AAAAGA-5' at 2819, 3'-AAAGAA-5' at 2820, 3'-AGAAAA-5' at 2822, 3'-AAAAGA-5' at 2824, 3'-AGAGAA-5' at 2827, 3'-AGAAGA-5' at 2829, 3'-AGAAAA-5' at 2832, 3'-AAAAAA-5' at 2834, 3'-AAAAGA-5' at 2836, 3'-AAAGAA-5' at 2837, 3'-AGAAAA-5' at 2839, 3'-AAAACA-5' at 2841, 3'-AAACAA-5' at 2842, 3'-AAAATA-5' at 2868, 3'-ATATAA-5' at 2873, 3'-AAAAAA-5' at 2929, 3'-ACATCA-5' at 2941, 3'-ACATTA-5' at 2951, 3'-AAACCA-5' at 2971, 3'-AAAATA-5' at 3012, 3'-AAATAA-5' at 3013, 3'-AAAAAA-5' at 3026, 3'-AAACTA-5' at 3029, 3'-AGACCA-5' at 3122, 3'-AAAACA-5' at 3166, 3'-ACATAA-5' at 3169, 3'-ATAAAA-5' at 3171, 3'-AAATTA-5' at 3175, 3'-AGATCA-5' at 3277, 3'-ACAAGA-5' at 3307, 3'-AGAGCA-5' at 3310, 3'-AAAACA-5' at 3329, 3'-AAACAA-5' at 3330, 3'-AAATAA-5' at 3334, 3'-AAACAA-5' at 3338, 3'-ACAAGA-5' at 3340, 3'-AGAAAA-5' at 3343, 3'-AAACCA-5' at 3365, 3'-AGAGGA-5' at 3387, 3'-ACATCA-5' at 3394, 3'-AGAGAA-5' at 3406, 3'-ACATCA-5' at 3415, 3'-ACATTA-5' at 3436, 3'-ATATTA-5' at 3454, 3'-ATATTA-5' at 3468, 3'-AAACCA-5' at 3484, 3'-AGATCA-5' at 3489, 3'-AAAACA-5' at 3511, 
3'-ACACAA-5' at 3514, 3'-ATAATA-5' at 3538, 3'-ACAAGA-5' at 3635, 3'-AGAGGA-5' at 3638, 3'-AAAGAA-5' at 3666, 3'-AGAACA-5' at 3668, 3'-AGAGGA-5' at 3675, 3'-ACAAGA-5' at 3759, 3'-AGACCA-5' at 3762, 3'-ACCACA-5' at 3764, 3'-ACAAAA-5' at 3767, 3'-AGAGCA-5' at 3913, 3'-AGATGA-5' at 3920, 3'-AGACCA-5' at 4031, 3'-ACAAAA-5' at 4066, 3'-AAAAAA-5' at 4068, 3'-AAAATA-5' at 4070, 3'-AAATAA-5' at 4071, 3'-AAATAA-5' at 4075, 3'-ATAATA-5' at 4077, 3'-ATAGAA-5' at 4080, 3'-AAAGAA-5' at 4084, 3'-AGAAAA-5' at 4086, 3'-AGACAA-5' at 4182, 3'-ACAAAA-5' at 4216, 3'-AAAAAA-5' at 4218, 3'-AAAATA-5' at 4220, 3'-AAATAA-5' at 4221, 3'-ATAATA-5' at 4223, 3'-AAAAAA-5' at 4378, 3'-AAAAGA-5' at 4380, 3'-AAAGAA-5' at 4381, 3'-AGAAAA-5' at 4383, 3'-AAAAAA-5' at 4385, 3'-AAAAGA-5' at 4387, 3'-AAAGAA-5' at 4388, 3'-AGAAAA-5' at 4390, 3'-AAAAGA-5' at 4392, 3'-AAAGAA-5' at 4393, and 3'-AGAAAA-5' at 4395.

Between ZNF497 and A1BG: On the negative strand in the positive direction there are six occurrences, two of which (at 2603 and 3825) lie after nucleotide number 2300: 3'-ACCACA-5' at 420, 3'-ACACCA-5' at 386, 3'-TGGTGT-5' at 511, 3'-TGGTGT-5' at 530, 3'-ACACCA-5' at 2603 and 3'-ACACCA-5' at 3825.

On the positive strand in the positive direction there are four H boxes, two of which (at 3643 and 3967) lie after nucleotide number 2300: 3'-ACACCA-5' at 204, 3'-ACACCA-5' at 528, 3'-ACACCA-5' at 3643 and 3'-ACACCA-5' at 3967.

Regarding 3'-ANANNA-5', on the negative strand, positive direction, there are 25 H boxes: 3'-ATACCA-5' at 2591, 3'-ACACCA-5' at 2603, 3'-ATAGAA-5' at 2628, 3'-AAACCA-5' at 2632, 3'-ACACTA-5' at 2637, 3'-ATATAA-5' at 2662, 3'-AGAGCA-5' at 2704, 3'-AGAGGA-5' at 2793, 3'-AAAGGA-5' at 2829, 3'-ACAGAA-5' at 2838, 3'-AAAGAA-5' at 3066, 3'-AGAACA-5' at 3094, 3'-AGAGCA-5' at 3138, 3'-ACAGCA-5' at 3212, 3'-ACAGTA-5' at 3414, 3'-AGATGA-5' at 3476, 3'-ACAGGA-5' at 3572, 3'-AAAGCA-5' at 3599, 3'-ACATGA-5' at 3708, 3'-ACACCA-5' at 3825, 3'-AAAAGA-5' at 3929, 3'-AGAACA-5' at 4068, 3'-AAATGA-5' at 4094, 3'-ACATCA-5' at 4116, and 3'-ACATGA-5' at 4154.

On the positive strand, positive direction there are 20 H boxes: 3'-AAATAA-5' at 2347, 3'-AAAAAA-5' at 2451, 3'-AAAACA-5' at 2453, 3'-AGACGA-5' at 2976, 3'-AGACCA-5' at 3022, 3'-AGAGAA-5' at 3056, 3'-AGAAGA-5' at 3058, 3'-AGAGGA-5' at 3302, 3'-AGACGA-5' at 3307, 3'-ACAGAA-5' at 3393, 3'-AGAAGA-5' at 3395, 3'-ACAGGA-5' at 3620, 3'-ACACCA-5' at 3643, 3'-AAACCA-5' at 3948, 3'-ACACCA-5' at 3967, 3'-AGAGGA-5' at 4059, 3'-AAAATA-5' at 4122, 3'-AAATCA-5' at 4137, 3'-AAATAA-5' at 4142, and 3'-ATATTA-5' at 4168.

There are 31 inverse H boxes on the negative strand in the positive direction: 3'-ATGACA-5' at 2412, 3'-ACTACA-5' at 2428, 3'-AGGACA-5' at 2460, 3'-ATTATA-5' at 2548, 3'-ACCACA-5' at 2600, 3'-AGGAAA-5' at 2623, 3'-AATAGA-5' at 2627, 3'-ACCACA-5' at 2634, 3'-AACAGA-5' at 2652, 3'-AGCAAA-5' at 2706, 3'-AGGAAA-5' at 2831, 3'-AACACA-5' at 2835, 3'-ATGACA-5' at 2843, 3'-AGAACA-5' at 3094, 3'-AACACA-5' at 3096, 3'-AGGACA-5' at 3131, 3'-ACCAAA-5' at 3175, 3'-AACAGA-5' at 3179, 3'-AGCAGA-5' at 3214, 3'-AGTAGA-5' at 3416, 3'-AATAAA-5' at 3427, 3'-ACCAGA-5' at 3548, 3'-ATGACA-5' at 3569, 3'-AGGAGA-5' at 3650, 3'-AGCACA-5' at 3740, 3'-ACCACA-5' at 3859, 3'-AAAAGA-5' at 3929, 3'-AGAACA-5' at 4068, 3'-ATCATA-5' at 4149, and 3'-ATTATA-5' at 4166.
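The H box listings above report hits at adjacent positions (for example runs of 3'-AAAAAA-5' one nucleotide apart), so any scan for 3'-ANANNA-5' has to report overlapping matches. The following is a minimal sketch of such a scan; the function name, the toy sequence and the 1-based position convention are assumptions made for illustration and are not taken from the source.

```python
import re

# ANANNA written as a regular expression: A, any base, A, any base, any base, A.
H_BOX_PATTERN = "A[ACGT]A[ACGT][ACGT]A"


def find_overlapping(pattern: str, sequence: str) -> list:
    """Return (position, match) pairs, including overlapping matches.

    A zero-width lookahead is used so the regex engine advances one
    nucleotide at a time instead of skipping past each match.
    Positions are 1-based (an assumption, not stated in the source).
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Toy sequence (hypothetical): a poly-A run followed by GCA.
toy_sequence = "AAAAAAAAGCA"
hits = find_overlapping(H_BOX_PATTERN, toy_sequence)
for position, match in hits:
    print(f"3'-{match}-5' at {position}")
```

A plain `re.findall` without the lookahead would report only non-overlapping hits and would undercount poly-A stretches like those in the distal promoter listing.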

HMG boxes

HNF6s

Core promoters

Inverse complement, positive strand, negative direction there is 1: 3'-TTATTAATTC-5', 4542.

Proximal promoters

Negative strand in the negative direction there is 1: 3'-TTATTAATCG-5', 4229.

Negative strand in the positive direction there are 2: 3'-TTATTAATCA-5', 4147, 3'-TTATTGATTA-5', 4164.

Inverse complement, positive strand, positive direction there is 1: 3'-ATATTAACAA-5', 4172.

Distal promoters

Negative strand in the negative direction there are 2: 3'-GTGTTAATAA-5', 1725, 3'-TAGTTGATAA-5', 3527.

Positive strand in the negative direction there is 1: 3'-AAATTGATAA-5', 3361.

Inverse complement, negative strand, negative direction there are 2: 3'-ACATGGACAT-5', 802, 3'-TAATGAACTT-5', 1301.

Inverse complement, positive strand, negative direction there are 2: 3'-AAATTGATAA-5', 3361, 3'-TCATCAACTA-5', 3525.

Negative strand in the positive direction there is 1: 3'-ATGTCCATGG-5', 3581.

Positive strand in the positive direction there is 1: 3'-GAGTCCATTG-5', 3732.

Inverse complement, positive strand, positive direction there is 1: 3'-CCATTGACTC-5', 3736.

HY boxes

Core promoters

Positive strand in the negative direction there is 1: 3'-TGAGGG-5' at 4558.

Inverse complement, negative strand, negative direction there is 1: 3'-CCCTCA-5', 4498.

Negative strand in the positive direction there is 1: 3'-TGTGGG-5', 4395.

Distal promoters

Negative strand in the negative direction there is 1: 3'-TGTGGG-5' at 749.

Positive strand in the negative direction there are 4: 3'-TGAGGG-5' at 88, 3'-TGAGGG-5' at 2699, 3'-TGAGGG-5' at 3652, 3'-TGTGGG-5' at 3712.

Inverse complement, negative strand, negative direction there are 3: 3'-CCCTCA-5', 2702, 3'-CCCACA-5', 3184, 3'-CCCTCA-5', 3889.

Positive strand in the positive direction there are 2: 3'-TGTGGG-5', 2965, 3'-TGTGGG-5', 3533.

Negative strand in the positive direction there are 3: 3'-TGAGGG-5', 258, 3'-TGAGGG-5', 3479, 3'-TGAGGG-5', 3879.

Inverse complement, negative strand, positive direction there are 3: 3'-CCCTCA-5', 88, 3'-CCCTCA-5', 3207, 3'-CCCTCA-5', 3503.

Inverse complement, positive strand, positive direction there are 5: 3'-CCCTCA-5', 494, 3'-CCCTCA-5', 662, 3'-CCCTCA-5', 1783, 3'-CCCACA-5', 1803, 3'-CCCTCA-5', 3185.

I boxes

Initiator elements (YYANWYY)

Core promoters

There is the following Inr in the core promoter, negative strand, negative direction: 3'-TTACTCC-5' at 4557.

There are four Inrs in the core promoter, positive strand, negative direction: 3'-CCACTCC-5' at 4425, 3'-CCACTTT-5' at 4461, 3'-TCACATT-5' at 4533, and 3'-TTAATTC-5' at 4542.

There is the following Inr in the core promoter, negative strand, positive direction: 3'-CTGCACC-5' at 4343.

There are two Inrs in the core promoter, positive strand, positive direction: 3'-CCACTCC-5' at 4401 and 3'-CCAGACC-5' at 4416.

Proximal promoters

There are eight Inrs on the negative strand in the negative direction: 3'-TCACTCT-5' at 4202, 3'-TCGGTCT-5' at 4233, 3'-CTGCACC-5' at 4238, 3'-TCGGACC-5' at 4300, 3'-CCAGTTT-5' at 4309, 3'-TCGGACC-5' at 4349, 3'-TCACACT-5' at 4361, and 3'-TTACTCC-5' at 4557.

There are seven Inrs on the positive strand in the negative direction: 3'-CCGGACT-5' at 4327, 3'-CTGCACT-5' at 4340, 3'-CCAGTTC-5' at 4417, 3'-CCACTCC-5' at 4425, 3'-CCACTTT-5' at 4461, 3'-TCACATT-5' at 4533, and 3'-TTAATTC-5' at 4542.

There is one Inr on the negative strand in the positive direction: 3'-CTGCACC-5' at 4343.

There are two Inrs on the positive strand in the positive direction: 3'-CCACTCC-5' at 4401 and 3'-CCAGACC-5' at 4416.

Distal promoters

Negative strand in the negative direction there are 87: 3'-TTGTTCC-5', 71, 3'-CTATACC-5', 77, 3'-CCGTTTC-5', 93, 3'-CCGTACT-5', 124, 3'-CCATATT-5', 181, 3'-CTACATT-5', 247, 3'-TTGGTCC-5', 262, 3'-TTATACT-5', 274, 3'-TCACTCT-5', 301, 3'-CTGCTTT-5', 312, 3'-CCGGTTC-5', 419, 3'-CCAGTCC-5', 441, 3'-TCGGACC-5', 459, 3'-TTGTATC-5', 468, 3'-TCACTTT-5', 473, 3'-TCGGACC-5', 508, 3'-CCGGTTC-5', 556, 3'-CCAGTCC-5', 578, 3'-TTATACC-5', 605, 3'-CCGGTCC-5', 648, 3'-CCGGTTC-5', 692, 3'-CCAGTCC-5', 714, 3'-TCGGACT-5', 732, 3'-TCGCACC-5', 741, 3'-CTACACC-5', 787, 3'-TCGGTTC-5', 874, 3'-TCGGACC-5', 899, 3'-TCGCTCT-5', 913, 3'-TCGGTCC-5', 948, 3'-CCGTACC-5', 953, 3'-TTAGTCC-5', 984, 3'-TTGGACC-5', 1015, 3'-TCACTCT-5', 1079, 3'-TCGGACC-5', 1198, 3'-TTGTACC-5', 1207, 3'-CCACTTT-5', 1212, 3'-CCGCACC-5', 1244, 3'-TTGGATC-5', 1306, 3'-TCAGACC-5', 1356, 3'-TTATTCT-5', 1365, 3'-TCGTTTT-5', 1371, 3'-TTGTTTT-5', 1394, 3'-CCACACT-5', 1479, 3'-TTGCTTC-5', 1555, 3'-CCGTTTT-5', 1561, 3'-TTACTTT-5', 1582, 3'-TTGGATT-5', 1591, 3'-TTAATTT-5', 1697, 3'-TTATACC-5', 1742, 3'-CCGCACC-5', 1897, 3'-CCGTACT-5', 1953, 3'-TTGGACC-5', 1959, 3'-TCGGACC-5', 2009, 3'-TCGTTCT-5', 2023, 3'-TTACACC-5', 2065, 3'-CCGGTCC-5', 2077, 3'-TCACATT-5', 2087, 3'-TCAAACT-5', 2141, 3'-TTGTACC-5', 2152, 3'-CCGCTTT-5', 2157, 3'-CCAGTCC-5', 2250, 3'-TCAAACT-5', 2257, 3'-TCGGACC-5', 2268, 3'-TCGTACC-5', 2277, 3'-CCACTTT-5', 2282, 3'-TTGGACC-5', 2385, 3'-TCGGACC-5', 2435, 3'-TCACTCT-5', 2449, 3'-TCGTTTT-5', 2476, 3'-TTGTTTT-5', 2490, 3'-TCATTCT-5', 2503, 3'-CCGGTCC-5', 2519, 3'-CCAGTCC-5', 2587, 3'-TCACACC-5', 2605, 3'-TTGTACC-5', 2614, 3'-CCACTTT-5', 2619, 3'-TCACACC-5', 2658, 3'-TTGGACC-5', 2720, 3'-TCGGACC-5', 2770, 3'-TCGTACT-5', 2784, 3'-TTGATTC-5', 2914, 3'-CCGATTT-5', 3009, 3'-TTGATTC-5', 3031, 3'-CCGCACC-5', 3047, 3'-TCGGACC-5', 3128, 3'-TTGTTCC-5', 3141, 3'-CCACTTT-5', 3146, 3'-TTGTATT-5', 3169, 3'-CCACACC-5', 3186, 3'-TCGGTTC-5', 3273, 3'-TCGGACC-5', 3298, 3'-TTGTTCT-5', 3307, 3'-TCGTTTT-5', 3313, 3'-TTGTTCT-5', 3340, 
3'-TCGTTCT-5', 3374, 3'-CCGAACT-5', 3401, 3'-CCGTATC-5', 3446, 3'-TTGATCT-5', 3463, 3'-TTGGTCT-5', 3486, 3'-CTGTTCT-5', 3759, 3'-CTACACC-5', 3810, 3'-CTGGTCC-5', 3871, 3'-TCATTCT-5', 3893, 3'-CTACTTT-5', 3922, 3'-CCGGTCC-5', 3951, 3'-TCGGACC-5', 4037, 3'-TTGTATC-5', 4046, 3'-TCACTCT-5', 4051, 3'-TTACACT-5', 4092, 3'-CCGGTCC-5', 4102, 3'-CCGTACC-5', 4107, 3'-CCGGTCC-5', 4170, 3'-TCGAACC-5', 4188.

Positive strand in the negative direction there are 40: 3'-CTGAATT-5', 20, 3'-TTGGACC-5', 32, 3'-CTGCATT-5', 152, 3'-TTGAACC-5', 846, 3'-TCACACC-5', 882, 3'-TTGAACC-5', 1012, 3'-TCACTCC-5', 1058, 3'-TCACACC-5', 1128, 3'-TTGAACC-5', 1303, 3'-TTGCACC-5', 1339, 3'-TTGCACT-5', 1347, 3'-CCAGTCT-5', 1354, 3'-CCATTTC-5', 1380, 3'-TCGCTCT-5', 1450, 3'-CTATATC-5', 1528, 3'-TTATTTT-5', 1727, 3'-CTGCACT-5', 2000, 3'-CTACTCC-5', 2352, 3'-TTGAACC-5', 2382, 3'-TCACACC-5', 2418, 3'-CTGCACT-5', 2426, 3'-TTGAATC-5', 2708, 3'-TTGAACC-5', 2717, 3'-CTGCACC-5', 2761, 3'-TTGAACC-5', 3245, 3'-TTGCACT-5', 3289, 3'-CCAGATC-5', 3488, 3'-CTGCTCC-5', 3582, 3'-CCATTTC-5', 3688, 3'-CTGGACT-5', 3747, 3'-CTGAACC-5', 3784, 3'-CCATACC-5', 3858, 3'-TCACACC-5', 3967.

Inverse complement, negative strand, negative direction there are 32: 3'-GATACAA-5', 213, 3'-GGACCGA-5', 598, 3'-AGTGCGG-5', 664, 3'-GGACTGG-5', 734, 3'-AGTGTGG-5', 882, 3'-GAAGTGA-5', 1056, 3'-AGTGTGG-5', 1128, 3'-GGACCGG-5', 1200, 3'-AGAGCGA-5', 1448, 3'-GGTCCGA-5', 1462, 3'-GATATAG-5', 1528, 3'-AGAACGG-5', 1608, 3'-AAAATAG-5', 1730, 3'-AGTGCAG-5', 1773, 3'-GGACCGA-5', 1843, 3'-AGTGCGG-5', 1992, 3'-AGTGCGG-5', 2208, 3'-AGTGTGG-5', 2418, 3'-AGTACGG-5', 2535, 3'-AGTACGG-5', 2753, 3'-AAAGTAG-5', 2887, 3'-GATTCGA-5', 3033, 3'-GGACCGG-5', 3130, 3'-AGTGCGG-5', 3281, 3'-AGTCCGA-5', 3398, 3'-GGTCTAG-5', 3488, 3'-GGTATGG-5', 3858, 3'-GGTCCGG-5', 3873, 3'-AGTGTGG-5', 3967.

Negative strand in the positive direction there are 45: 3'-TTGTATT-5', 115, 3'-CTGTTTT-5', 147, 3'-CCACACT-5', 345, 3'-CCGGACT-5', 746, 3'-CTGCACT-5', 1372, 3'-CTGCACT-5', 1472, 3'-CCAGACT-5', 1744, 3'-CCACTTC-5', 1914, 3'-CTATTTC-5', 1978, 3'-CCAGTCC-5', 2026, 3'-TCGCTTC-5', 2095, 3'-TCATATT-5', 2178, 3'-CTGCATT-5', 2206, 3'-CCAGATC-5', 2230, 3'-TCAATCT-5', 2235, 3'-CTGTTTC-5', 2263, 3'-TCACTCT-5', 2306, 3'-CTACACC-5', 2430, 3'-CTAATTT-5', 2440, 3'-CCGCACC-5', 2566, 3'-TTATACC-5', 2590, 3'-CCACACC-5', 2602, 3'-CCACACT-5', 2636, 3'-TCAGATT-5', 2868, 3'-CTGCTCC-5', 2978, 3'-CCAGTCC-5', 2998, 3'-CCAGTCC-5', 3084, 3'-CTGGTCT-5', 3245, 3'-TCGCTCT-5', 3276, 3'-CTGGTCT-5', 3299, 3'-CTGCTCC-5', 3309, 3'-CTGCACC-5', 3322, 3'-CCGCATC-5', 3328, 3'-TTGCACT-5', 3343, 3'-CTGTTCC-5', 3352, 3'-TTGCATC-5', 3402, 3'-TCACACT-5', 3507, 3'-CCAGACC-5', 3550, 3'-CTGTTCC-5', 3625, 3'-TCACACC-5', 3824, 3'-TCATTTT-5', 4120, 3'-TCACTCT-5', 4128, 3'-TTGATTT-5', 4134, 3'-TTAGTTT-5', 4139.

Positive strand in the positive direction there are 75: 3'-CTGGACC-5', 40, 3'-CCGGTCC-5', 215, 3'-TTACACT-5', 230, 3'-CCGGACC-5', 286, 3'-CCGTTCC-5', 503, 3'-TCGGTCC-5', 515, 3'-CCGCTCT-5', 557, 3'-CCGTTCC-5', 587, 3'-CCGCTCT-5', 641, 3'-CCGTTCC-5', 671, 3'-CCGGACT-5', 725, 3'-CCGTTCC-5', 823, 3'-TCGGTCT-5', 835, 3'-TTGGACC-5', 847, 3'-CCGTTCC-5', 923, 3'-TCGGTCT-5', 935, 3'-TTGGACC-5', 947, 3'-CCGTTCC-5', 1007, 3'-TCGCTCT-5', 1061, 3'-CCGGTCC-5', 1175, 3'-CCGCTCT-5', 1229, 3'-CCGTTCC-5', 1259, 3'-CCGTTCC-5', 1327, 3'-CCGCTCT-5', 1381, 3'-CCGTTCC-5', 1427, 3'-CCGCTCT-5', 1481, 3'-TCGTTCC-5', 1511, 3'-CCGCTCT-5', 1565, 3'-CCGCACT-5', 1720, 3'-CCACACC-5', 1805, 3'-CCGCTCT-5', 1921, 3'-CCGTTCT-5', 1948, 3'-CCACACC-5', 1971, 3'-TCAATTT-5', 2136, 3'-TTGTACT-5', 2141, 3'-CTACTTT-5', 2146, 3'-CCGTTCT-5', 2190, 3'-CCAGTCT-5', 2222, 3'-TTGGTCT-5', 2228, 3'-CCGCACT-5', 2555, 3'-CCGGTCC-5', 2574, 3'-TCAGTCT-5', 2609, 3'-TCAGTTC-5', 2615, 3'-TCAGTCC-5', 2620, 3'-CTATATT-5', 2662, 3'-TCAATCC-5', 2668, 3'-TCGTTTT-5', 2707, 3'-TCGATTC-5', 2789, 3'-TTGCTCC-5', 2806, 3'-CTAAACT-5', 2871, 3'-CTGGTCC-5', 2876, 3'-CCAGACT-5', 2943, 3'-CCGGACC-5', 2988, 3'-CCAGACC-5', 3021, 3'-TTATACC-5', 3162, 3'-CTGGTTT-5', 3175, 3'-TCGGTCT-5', 3221, 3'-CTACTCC-5', 3478, 3'-CCGATCC-5', 3484, 3'-TCGATCC-5', 3522, 3'-CTGGTCT-5', 3548, 3'-TCACACT-5', 3594, 3'-CCACTCC-5', 3647, 3'-CCGGACC-5', 3679, 3'-CCGGACC-5', 3758, 3'-CTGGACC-5', 3787, 3'-TCACTCC-5', 3878, 3'-TCAGACT-5', 3924, 3'-TCACACC-5', 3966, 3'-CCACACT-5', 3971, 3'-TTACTCC-5', 4096, 3'-CTACTCC-5', 4102, 3'-CTAAATC-5', 4136, 3'-CCACTCC-5'.

Inverse complement, negative strand, positive direction there are 61: 3'-AGAGTGG-5', 53, 3'-AATGTGA-5', 230, 3'-GGAGCGA-5', 429, 3'-AGACCGG-5', 442, 3'-GGTGCGG-5', 489, 3'-AGTGCGG-5', 498, 3'-AGTGCGG-5', 582, 3'-AGTGCGG-5', 666, 3'-GGTGCAG-5', 784, 3'-AGTGCGG-5', 1086, 3'-AGTGCGG-5', 1170, 3'-AGTGCGG-5', 1254, 3'-AATGCGG-5', 1322, 3'-AATGCGG-5', 1422, 3'-AGTGCGG-5', 1590, 3'-GAAGCGG-5', 1636, 3'-GGTGCGG-5', 1764, 3'-AGTGCAG-5', 1787, 3'-GGTGTGG-5', 1805, 3'-GAACTGG-5', 1953, 3'-GGTGTGG-5', 1971, 3'-AAAGCAG-5', 2007, 3'-AGTGCAG-5', 2064, 3'-GAACCAG-5', 2227, 3'-AGATCAA-5', 2232, 3'-AGTGCAG-5', 2327, 3'-GGTGCAA-5', 2335, 3'-GAAATAG-5', 2626, 3'-GATATAA-5', 2662, 3'-GGACTGA-5', 2674, 3'-AGAGCAA-5', 2705, 3'-AAAGTGG-5', 2711, 3'-GGTGCAA-5', 2801, 3'-AGAATGA-5', 2841, 3'-GATTTGA-5', 2871, 3'-GGTCTGA-5', 2943, 3'-GGTCTGG-5', 3021, 3'-AATATGG-5', 3162, 3'-GAAATGG-5', 3168, 3'-GGACCAA-5', 3174, 3'-GGAATGA-5', 3441, 3'-GATGCAG-5', 3460, 3'-AGTGCAG-5', 3465, 3'-GGACCAG-5', 3547, 3'-GGAATGA-5', 3567, 3'-AGTGTGA-5', 3594, 3'-GAAGCGG-5', 3670, 3'-AATCCGA-5', 3799, 3'-AGAATGA-5', 3835, 3'-GAACCAG-5', 3840, 3'-AGAGTGA-5', 3876, 3'-AGTCTGA-5', 3924, 3'-AGTGTGG-5', 3966, 3'-GGTGTGA-5', 3971, 3'-AGAGTGG-5', 4040, 3'-AGAACAG-5', 4069, 3'-GAAATGA-5', 4094, 3'-GATTTAG-5', 4136.

Inverse complement, positive strand, negative direction there are 100: 3'-AGACTGA-5', 17, 3'-GGACCAG-5', 34, 3'-AAAACAA-5', 69, 3'-GATATGG-5', 77, 3'-AAACTGA-5', 130, 3'-AAAACAG-5', 167, 3'-GGTATAA-5', 181, 3'-GAAACAA-5', 229, 3'-GATGTAA-5', 247, 3'-AGTTCAA-5', 255, 3'-AAACCAG-5', 261, 3'-AATATGA-5', 274, 3'-AGAACAG-5', 288, 3'-AAACTGA-5', 307, 3'-GGTGCGG-5', 380, 3'-AGTGCGA-5', 448, 3'-AATACGA-5', 492, 3'-AAATTAG-5', 499, 3'-AGATTGA-5', 585, 3'-AATATGG-5', 605, 3'-AATACAA-5', 635, 3'-AAATTGG-5', 643, 3'-AGTTCGA-5', 721, 3'-AGACCAG-5', 727, 3'-AATACAA-5', 769, 3'-AAATTAG-5', 777, 3'-GATGTGG-5', 787, 3'-AGAGCGA-5', 911, 3'-GATCCAG-5', 975, 3'-AGATTGG-5', 1045, 3'-AGAGTGA-5', 1077, 3'-AAATTAG-5', 1234, 3'-AGTCTGG-5', 1356, 3'-AGAGCAA-5', 1369, 3'-AAAACAA-5', 1388, 3'-AGTGCAG-5', 1471, 3'-GGTGTGA-5', 1479, 3'-AGTGCAA-5', 1536, 3'-AGAACGA-5', 1553, 3'-AATACAG-5', 1566, 3'-GAAACAA-5', 1585, 3'-GAAATGA-5', 1663, 3'-AAAGCGG-5', 1680, 3'-GAATTAA-5', 1696, 3'-AATATGG-5', 1742, 3'-AATACAA-5', 1878, 3'-AAATTAG-5', 1887, 3'-AGACTGA-5', 1935, 3'-AGAATGG-5', 1948, 3'-AGAGCAA-5', 2021, 3'-AATGTGG-5', 2065, 3'-GGTGCAG-5', 2082, 3'-AGTGTAA-5', 2087, 3'-AGTTTGA-5', 2141, 3'-AGACCAA-5', 2147, 3'-GATACAA-5', 2180, 3'-AAAATGA-5', 2187, 3'-GGTGCGG-5', 2197, 3'-AGTTTGA-5', 2257, 3'-AGACCAG-5', 2263, 3'-AATACAA-5', 2305, 3'-AAACTAG-5', 2313, 3'-AGAGTGA-5', 2447, 3'-GATTCGG-5', 2454, 3'-AAAGCAA-5', 2474, 3'-AAAGCAA-5', 2480, 3'-AAAACAA-5', 2509, 3'-AGACCAG-5', 2600, 3'-AGTGTGG-5', 2605, 3'-AAATCAG-5', 2649, 3'-AGTGTGG-5', 2658, 3'-AAAACAA-5', 2842, 3'-AGAATGG-5', 3004, 3'-AAAATAA-5', 3013, 3'-AAACTAA-5', 3030, 3'-AGACCAG-5', 3123, 3'-AAATTAG-5', 3176, 3'-GGTGTGG-5', 3186, 3'-AGAGCAA-5', 3311, 3'-AAAACAA-5', 3330, 3'-AAATTGA-5', 3358, 3'-GAAGTGA-5', 3410, 3'-GAACTAG-5', 3462, 3'-AAACCAG-5', 3485, 3'-AATCCAG-5', 3681, 3'-GGAACAG-5', 3725, 3'-GGACTGG-5', 3749, 3'-AATGCAG-5', 3772, 3'-GATGTGG-5', 3810, 3'-GGACCAG-5', 3870, 3'-GGAGTAA-5', 3891, 3'-AGTTCAA-5', 4026, 3'-AGACCAG-5', 4032, 
3'-AAAATAA-5', 4071, 3'-AATGTGA-5', 4092, 3'-AGTTCAA-5', 4177.

Inverse complement, positive strand, positive direction there are 75: 3'-GGTCCGA-5', 10, 3'-AGTCCGG-5', 92, 3'-AATCCAG-5', 152, 3'-GGTCCAG-5', 217, 3'-GGTGTGA-5', 345, 3'-GAAGCGG-5', 459, 3'-AGAATGA-5', 524, 3'-GAAGCGG-5', 595, 3'-GATGCGA-5', 652, 3'-GGTGCGA-5', 777, 3'-GGACCGG-5', 849, 3'-GGACCGG-5', 949, 3'-GGTCCGA-5', 1177, 3'-AAAGCAG-5', 1183, 3'-GAAGCGG-5', 1308, 3'-GAAGCGG-5', 1408, 3'-AATTCGG-5', 1541, 3'-GATGCGA-5', 1576, 3'-GGACTGG-5', 1662, 3'-GGTCTGA-5', 1744, 3'-GGACCGA-5', 1817, 3'-GGTCCGG-5', 1857, 3'-AGAATGG-5', 1888, 3'-GAAGTAG-5', 2110, 3'-AGTATAA-5', 2178, 3'-GGACTGG-5', 2213, 3'-GGTCTAG-5', 2230, 3'-AGAGTGG-5', 2247, 3'-AAAGTGA-5', 2304, 3'-GGTCCGA-5', 2318, 3'-AATCCGA-5', 2368, 3'-GATGTGG-5', 2430, 3'-GGACCGA-5', 2435, 3'-AGAGTGG-5', 2470, 3'-GGTACAA-5', 2475, 3'-GGACCGG-5', 2571, 3'-AATATGG-5', 2590, 3'-GGTGTGG-5', 2602, 3'-AGTTCAG-5', 2617, 3'-GGTGTGA-5', 2636, 3'-AGTCTAA-5', 2868, 3'-AAACTGG-5', 2873, 3'-GGTCCGG-5', 2878, 3'-AGACCGA-5', 2885, 3'-GGAGTAA-5', 2902, 3'-AGACTGA-5', 2945, 3'-AGACCGG-5', 2985, 3'-GGACCGG-5', 2990, 3'-GGAACAG-5', 3003, 3'-GGTCCAG-5', 3018, 3'-AGACCAA-5', 3023, 3'-AGTCCGG-5', 3036, 3'-GGACCAA-5', 3049, 3'-GAAGTAG-5', 3250, 3'-AGTGCAG-5', 3255, 3'-GGACCAG-5', 3298, 3'-AGAGTGA-5', 3317, 3'-GGTACAA-5', 3337, 3'-GGAACGG-5', 3375, 3'-AGTGTGA-5', 3507, 3'-GATCCGA-5', 3524, 3'-GGTCTGG-5', 3550, 3'-AGAGTGG-5', 3612, 3'-GGACCGG-5', 3681, 3'-AGTGTGG-5', 3824, 3'-GAACTGG-5', 4018, 3'-AAAATAG-5', 4123, 3'-GAACTAA-5', 4133, 3'-AAATCAA-5', 4138.
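The consensus strings in the headings (YYANWYY here, BBCABW in the next section) use IUPAC ambiguity codes. The following is a minimal sketch of how a listed sequence can be checked against such a consensus; the helper names are hypothetical, and the source does not say which tool produced the listings.

```python
import re

# Standard IUPAC nucleotide ambiguity codes as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(consensus: str) -> str:
    """Translate an IUPAC consensus such as YYANWYY into a regex."""
    return "".join(IUPAC[code] for code in consensus)


def matches_consensus(consensus: str, candidate: str) -> bool:
    """True if the whole candidate sequence fits the consensus."""
    return re.fullmatch(iupac_to_regex(consensus), candidate) is not None


# Core promoter entries copied from the listing above.
core_promoter_inrs = ["TTACTCC", "CCACTCC", "CCACTTT", "TCACATT", "TTAATTC"]
all_match = all(matches_consensus("YYANWYY", s) for s in core_promoter_inrs)
print(iupac_to_regex("YYANWYY"), all_match)
```

Under this reading the five core promoter entries satisfy YYANWYY as written, whereas an entry such as 3'-TCGGTCT-5' (proximal promoter, 4233) has G at the third position and does not; the source does not state whether the search allowed a purine at that position.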

Initiator elements (BBCABW)

Core promoters

There are five Inrs, positive strand, negative direction: 3'-TCCACT-5', 4423, 3'-CCCAGA-5', 4448, 3'-TCCACT-5', 4459, 3'-CCCACT-5', 4485, 3'-TTCACA-5', 4531.

There are five Inrs, negative strand, positive direction: 3'-GTCAGT-5', 4271, 3'-CTCATT-5', 4309, 3'-TGCAGA-5', 4317, 3'-CCCAGA-5', 4330, 3'-CTCACT-5', 4338.

There are four Inrs, positive strand, positive direction: 3'-TCCAGT-5', 4269, 3'-CTCACT-5', 4350, 3'-CCCACT-5', 4399, 3'-CCCAGA-5', 4414.

Proximal promoters

There are five Inrs on the negative strand in the negative direction: 3'-GTCACT-5', 4200, 3'-TCCAGT-5', 4307, 3'-GTCACT-5', 4319, 3'-CCCACT-5', 4353, 3'-GTCACA-5', 4359.

There are nine Inrs on the positive strand in the negative direction: 3'-GCCAGA-5', 4233, 3'-TGCAGT-5', 4317, 3'-TGCACT-5', 4340, 3'-GCCAGT-5', 4415, 3'-TCCACT-5', 4423, 3'-CCCAGA-5', 4448, 3'-TCCACT-5', 4459, 3'-CCCACT-5', 4485, 3'-TTCACA-5', 4531.

There are six Inrs on the negative strand in the positive direction: 3'-CTCAGA-5', 4195, 3'-GTCAGT-5', 4271, 3'-CTCATT-5', 4309, 3'-TGCAGA-5', 4317, 3'-CCCAGA-5', 4330, 3'-CTCACT-5', 4338.

There are four Inrs on the positive strand in the positive direction: 3'-TCCAGT-5', 4269, 3'-CTCACT-5', 4350, 3'-CCCACT-5', 4399, 3'-CCCAGA-5', 4414.

Distal promoters

Negative strand in the negative direction there are 44: 3'-TCCATA-5', 179, 3'-CCCAGT-5', 206, 3'-CTCAGA-5', 278, 3'-GTCACT-5', 299, 3'-TTCACA-5', 322, 3'-TCCAGT-5', 439, 3'-TGCATT-5', 533, 3'-TCCAGT-5', 568, 3'-TCCAGT-5', 576, 3'-TCCAGT-5', 712, 3'-GGCAGA-5', 754, 3'-GCCACT-5', 868, 3'-GTCACT-5', 1034, 3'-CCCACT-5', 1049, 3'-CTCACT-5', 1077, 3'-GGCACA-5', 1220, 3'-GTCACT-5', 1325, 3'-GTCAGA-5', 1354, 3'-CTCAGA-5', 1444, 3'-GGCAGT-5', 1511, 3'-TGCAGA-5', 1774, 3'-GTCACT-5', 1978, 3'-GTCACA-5', 2085, 3'-TCCAGT-5', 2248, 3'-GTCACT-5', 2404, 3'-CTCACT-5', 2447, 3'-TCCAGT-5', 2585, 3'-GTCACA-5', 2603, 3'-GTCACA-5', 2656, 3'-GTCACT-5', 2739, 3'-TTCACA-5', 2860, 3'-TCCACT-5', 3144, 3'-CCCACA-5', 3184, 3'-TTCACT-5', 3410, 3'-GTCATT-5', 3480, 3'-TCCACT-5', 3825, 3'-CTCATA-5', 3829, 3'-CTCATT-5', 3891, 3'-TTCACA-5', 3939.

Positive strand in the negative direction there are 59: 3'-GCCATA-5', 39, 3'-TGCATT-5', 152, 3'-GTCACT-5', 208, 3'-GGCACA-5', 266, 3'-GGCACA-5', 518, 3'-GGCACA-5', 960, 3'-GGCAGA-5', 1023, 3'-TGCAGT-5', 1032, 3'-TTCACT-5', 1056, 3'-GGCACA-5', 1116, 3'-CTCACA-5', 1126, 3'-GGCAGA-5', 1314, 3'-TGCAGT-5', 1323, 3'-TGCACT-5', 1347, 3'-TCCAGT-5', 1352, 3'-TCCATT-5', 1378, 3'-CCCAGA-5', 1411, 3'-TGCAGT-5', 1472, 3'-CTCACT-5', 1491, 3'-CCCAGA-5', 1518, 3'-TCCAGT-5', 1532, 3'-TGCACA-5', 1719, 3'-GGCAGA-5', 1967, 3'-TGCAGT-5', 1976, 3'-GCCACT-5', 1995, 3'-TGCACT-5', 2000, 3'-TGCAGT-5', 2083, 3'-GCCAGT-5', 2211, 3'-TGCAGT-5', 2402, 3'-TGCACT-5', 2426, 3'-TCCACT-5', 2632, 3'-GCCAGT-5', 2654, 3'-GGCACA-5', 2665, 3'-TGCAGT-5', 2737, 3'-GCCACT-5', 2756, 3'-GCCATT-5', 3284, 3'-TGCACT-5', 3289, 3'-TGCAGA-5', 3431, 3'-GGCATA-5', 3445, 3'-GGCATA-5', 3451, 3'-GGCAGT-5', 3478, 3'-GGCAGA-5', 3589, 3'-GGCAGT-5', 3600, 3'-GTCAGA-5', 3625, 3'-GGCACA-5', 3632, 3'-CTCAGA-5', 3644, 3'-GCCATT-5', 3686, 3'-TCCACA-5', 3692, 3'-CCCATA-5', 3856, 3'-CTCACA-5', 3965.

Inverse complement, negative strand, negative direction there are 46: 3'-TCTGAC-5', 16, 3'-TGTGGA-5', 62, 3'-TGTGCA-5', 342, 3'-TGTGCA-5', 531, 3'-AGTGCG-5', 663, 3'-TGTGGG-5', 749, 3'-TCTGAG-5', 916, 3'-TGTGCG-5', 963, 3'-ACTGAA-5', 1052, 3'-AGTGAG-5', 1057, 3'-TCTGAG-5', 1082, 3'-TGTGGA-5', 1129, 3'-AGTGGA-5', 1171, 3'-AATGAA-5', 1298, 3'-TCTGAG-5', 1403, 3'-AGTGAC-5', 1492, 3'-TGTGAA-5', 1544, 3'-TCTGAA-5', 1617, 3'-AGTGCA-5', 1772, 3'-TCTGAC-5', 1934, 3'-AGTGCG-5', 1991, 3'-TCTGAG-5', 2026, 3'-TATGAC-5', 2162, 3'-ACTGGC-5', 2190, 3'-AGTGCG-5', 2207, 3'-TGTGAA-5', 2551, 3'-AGTGAA-5', 2578, 3'-ACTGAG-5', 2787, 3'-TATGGA-5', 2994, 3'-AGTGGG-5', 3057, 3'-AGTGAA-5', 3101, 3'-AGTGAA-5', 3240, 3'-AGTGCG-5', 3280, 3'-TCTGAC-5', 3425, 3'-TATGAC-5', 3541, 3'-TATGCG-5', 3547, 3'-TATGGA-5', 3859, 3'-TGTGGA-5', 3968, 3'-TGTGAA-5', 3983.

Inverse complement, positive strand, negative direction there are 54: 3'-ACTGAA-5', 18, 3'-TATGGG-5', 78, 3'-ACTGAA-5', 131, 3'-TATGAG-5', 275, 3'-AGTGAG-5', 300, 3'-ACTGAC-5', 308, 3'-AGTGCG-5', 447, 3'-AGTGAA-5', 472, 3'-AGTGGA-5', 523, 3'-AGTGAG-5', 1035, 3'-AGTGAG-5', 1078, 3'-AGTGGC-5', 1121, 3'-AGTGAG-5', 1326, 3'-TCTGGG-5', 1357, 3'-AGTGCA-5', 1470, 3'-ACTGCA-5', 1494, 3'-AGTGCA-5', 1535, 3'-AATGAA-5', 1581, 3'-AATGCC-5', 1634, 3'-TATGGC-5', 1743, 3'-ACTGAG-5', 1936, 3'-AATGGC-5', 1949, 3'-AGTGAG-5', 1979, 3'-ACTGCA-5', 1998, 3'-TGTGGC-5', 2066, 3'-AATGAC-5', 2188, 3'-AGTGAG-5', 2405, 3'-ACTGCA-5', 2424, 3'-AGTGAG-5', 2448, 3'-TGTGGC-5', 2606, 3'-AGTGAG-5', 2740, 3'-ACTGCA-5', 2759, 3'-TGTGCA-5', 2863, 3'-AATGGC-5', 3005, 3'-TGTGAG-5', 3268, 3'-AGTGAC-5', 3411, 3'-TGTGCA-5', 3429, 3'-TGTGCC-5', 3561, 3'-AATGGG-5', 3660, 3'-TGTGGG-5', 3712, 3'-ACTGGG-5', 3750, 3'-AATGCA-5', 3771, 3'-TCTGGA-5', 3836, 3'-ACTGCC-5', 3852, 3'-TGTGGC-5', 3960, 3'-AGTGAG-5', 4050, 3'-TGTGAG-5', 4093.

Negative strand in the positive direction there are 87: 3'-TCCAGA-5', 15, 3'-GGCATT-5', 22, 3'-GTCACA-5', 155, 3'-CCCAGA-5', 204, 3'-GCCACA-5', 343, 3'-CGCAGA-5', 396, 3'-TGCAGA-5', 438, 3'-CCCAGA-5', 468, 3'-TGCACA-5', 548, 3'-TCCACA-5', 632, 3'-CGCACT-5', 686, 3'-CGCACA-5', 800, 3'-GCCAGA-5', 835, 3'-GCCACA-5', 884, 3'-GCCAGA-5', 935, 3'-GCCACA-5', 984, 3'-CGCACA-5', 1052, 3'-CGCACA-5', 1136, 3'-TGCACA-5', 1220, 3'-CCCAGT-5', 1250, 3'-CGCAGA-5', 1316, 3'-TGCACT-5', 1372, 3'-CGCAGA-5', 1416, 3'-TGCACT-5', 1472, 3'-CCCACT-5', 1502, 3'-CGCACA-5', 1556, 3'-GGCATT-5', 1702, 3'-CCCAGA-5', 1742, 3'-TGCACA-5', 1822, 3'-TCCACT-5', 1912, 3'-TGCAGA-5', 1937, 3'-GGCACT-5', 1996, 3'-CCCAGT-5', 2024, 3'-TCCACA-5', 2029, 3'-CTCAGT-5', 2060, 3'-TGCAGT-5', 2065, 3'-GCCACT-5', 2072, 3'-TTCAGT-5', 2098, 3'-CTCATA-5', 2176, 3'-TGCATT-5', 2206, 3'-GTCAGA-5', 2222, 3'-CTCAGA-5', 2239, 3'-TTCACT-5', 2304, 3'-TGCAGT-5', 2328, 3'-GTCACT-5', 2425, 3'-GTCAGA-5', 2609, 3'-CTCAGA-5', 2699, 3'-TGCAGA-5', 2721, 3'-CTCAGA-5', 2729, 3'-TGCAGA-5', 2859, 3'-CTCAGA-5', 2866, 3'-CTCATT-5', 2902, 3'-GTCACT-5', 2929, 3'-TTCAGT-5', 2936, 3'-TGCACA-5', 2962, 3'-TGCATT-5', 3072, 3'-CCCAGT-5', 3082, 3'-CCCAGA-5', 3091, 3'-TCCACA-5', 3192, 3'-CTCACA-5', 3209, 3'-GCCAGA-5', 3221, 3'-TGCAGT-5', 3232, 3'-TGCAGT-5', 3281, 3'-CTCACT-5', 3317, 3'-TGCACT-5', 3343, 3'-CCCAGT-5', 3379, 3'-CCCACT-5', 3388, 3'-GGCACA-5', 3409, 3'-TGCAGT-5', 3461, 3'-GGCAGA-5', 3473, 3'-CTCACA-5', 3505, 3'-GCCACA-5', 3705, 3'-TCCAGA-5', 3806, 3'-GTCACA-5', 3822, 3'-TGCAGA-5', 3831, 3'-TCCAGA-5', 3891, 3'-CGCAGA-5', 3916, 3'-GTCACA-5', 3954, 3'-TGCAGT-5', 3962, 3'-GGCACT-5', 4006, 3'-TCCACT-5', 4013.

Positive strand in the positive direction there are 40: 3'-TCCAGT-5', 153, 3'-CGCACA-5', 1020, 3'-CCCAGA-5', 1711, 3'-CGCACT-5', 1720, 3'-CCCACA-5', 1803, 3'-CCCAGA-5', 1958, 3'-TCCACA-5', 1969, 3'-GTCAGT-5', 2100, 3'-TCCACT-5', 2128, 3'-TCCAGT-5', 2220, 3'-TCCAGA-5', 2258, 3'-TCCACT-5', 2375, 3'-CGCAGT-5', 2423, 3'-GTCACA-5', 2464, 3'-CCCAGA-5', 2489, 3'-TTCACT-5', 2511, 3'-CGCACT-5', 2555, 3'-GTCAGT-5', 2607, 3'-CTCAGT-5', 2613, 3'-TTCAGT-5', 2618, 3'-TCCATA-5', 2642, 3'-TCCAGA-5', 3019, 3'-CTCAGA-5', 3187, 3'-TGCAGA-5', 3256, 3'-CTCACA-5', 3592, 3'-GCCAGA-5', 3608, 3'-CTCACT-5', 3712, 3'-TCCATT-5', 3731, 3'-TCCAGA-5', 3771, 3'-CCCAGT-5', 3820, 3'-GTCACT-5', 3843, 3'-CTCACT-5', 3876, 3'-TTCAGA-5', 3922, 3'-TCCACT-5', 3934, 3'-GTCACA-5', 3964, 3'-CGCAGA-5', 4056.

Inverse complement, negative strand, positive direction there are 94: 3'-AGTGGG-5', 54, 3'-TCTGCA-5', 224, 3'-TGTGAA-5', 231, 3'-ACTGCC-5', 238, 3'-TCTGAG-5', 256, 3'-TCTGGA-5', 271, 3'-ACTGGG-5', 348, 3'-AGTGCG-5', 497, 3'-AGTGCG-5', 581, 3'-AGTGCG-5', 665, 3'-ACTGCG-5', 749, 3'-TGTGGC-5', 819, 3'-ACTGCC-5', 901, 3'-TGTGGC-5', 919, 3'-ACTGCG-5', 1001, 3'-TGTGGC-5', 1023, 3'-AGTGCG-5', 1085, 3'-AGTGCG-5', 1160, 3'-AGTGCG-5', 1169, 3'-AGTGCG-5', 1253, 3'-ACTGAG-5', 1287, 3'-AATGCG-5', 1321, 3'-TCTGGC-5', 1377, 3'-TCTGCG-5', 1396, 3'-AATGCG-5', 1421, 3'-TCTGGC-5', 1477, 3'-TCTGCG-5', 1496, 3'-ACTGCA-5', 1505, 3'-AGTGCG-5', 1589, 3'-AGTGCG-5', 1725, 3'-AGTGCA-5', 1786, 3'-TGTGGA-5', 1806, 3'-TCTGGG-5', 1865, 3'-ACTGGG-5', 1954, 3'-TGTGGC-5', 1972, 3'-TCTGGC-5', 1993, 3'-AGTGCA-5', 2063, 3'-AGTGGC-5', 2068, 3'-TATGGC-5', 2160, 3'-ACTGCA-5', 2204, 3'-AGTGCA-5', 2326, 3'-TGTGCA-5', 2681, 3'-AGTGGA-5', 2712, 3'-ACTGCC-5', 2823, 3'-AATGAC-5', 2842, 3'-TCTGCA-5', 2857, 3'-TCTGGC-5', 2884, 3'-AATGGG-5', 2911, 3'-TCTGAC-5', 2944, 3'-TCTGAG-5', 2951, 3'-TGTGCA-5', 2960, 3'-TCTGGC-5', 2984, 3'-TCTGAG-5', 3007, 3'-AGTGCC-5', 3011, 3'-TATGAC-5', 3028, 3'-TCTGCA-5', 3061, 3'-AATGCA-5', 3070, 3'-ACTGGC-5', 3118, 3'-TCTGAG-5', 3124, 3'-TATGGA-5', 3163, 3'-AATGGG-5', 3169, 3'-AGTGCC-5', 3235, 3'-TATGAG-5', 3261, 3'-TCTGCA-5', 3268, 3'-TCTGCA-5', 3279, 3'-ACTGCA-5', 3320, 3'-ACTGGC-5', 3346, 3'-TCTGCC-5', 3359, 3'-TCTGGC-5', 3406, 3'-AATGCC-5', 3431, 3'-TGTGGA-5', 3437, 3'-AATGAA-5', 3442, 3'-AATGAG-5', 3446, 3'-AGTGGG-5', 3450, 3'-AGTGCA-5', 3464, 3'-AATGAC-5', 3568, 3'-TGTGAA-5', 3595, 3'-AGTGAC-5', 3713, 3'-ACTGAG-5', 3736, 3'-AATGAC-5', 3783, 3'-AATGAA-5', 3836, 3'-AGTGAG-5', 3877, 3'-TGTGAG-5', 3904, 3'-TCTGAA-5', 3925, 3'-TGTGCA-5', 3960, 3'-TGTGAC-5', 3972, 3'-AGTGGG-5', 4041, 3'-ACTGAA-5', 4090, 3'-AATGAG-5', 4095.

Inverse complement, positive strand, positive direction there are 47: 3'-TCTGAC-5', 236, 3'-TGTGAC-5', 346, 3'-TCTGCC-5', 399, 3'-TCTGGC-5', 441, 3'-AATGAA-5', 525, 3'-TGTGCA-5', 569, 3'-TGTGCG-5', 803, 3'-TGTGCG-5', 887, 3'-TGTGCG-5', 987, 3'-TGTGAC-5', 1139, 3'-TGTGCC-5', 1223, 3'-TGTGCC-5', 1559, 3'-ACTGGG-5', 1663, 3'-TGTGCC-5', 1698, 3'-TCTGAA-5', 1745, 3'-AATGGG-5', 1889, 3'-ACTGGC-5', 2214, 3'-AGTGGA-5', 2248, 3'-AGTGAG-5', 2305, 3'-AGTGGG-5', 2313, 3'-AGTGAC-5', 2341, 3'-TCTGAA-5', 2417, 3'-TGTGGA-5', 2431, 3'-TATGAA-5', 2740, 3'-TCTGGA-5', 2862, 3'-AGTGAC-5', 2930, 3'-ACTGAA-5', 2946, 3'-TGTGGG-5', 2965, 3'-ACTGAA-5', 3030, 3'-AGTGCA-5', 3254, 3'-AGTGAC-5', 3318, 3'-TGTGAG-5', 3508, 3'-TGTGGG-5', 3533, 3'-TCTGGA-5', 3551, 3'-AGTGGG-5', 3613, 3'-AGTGCC-5', 3748, 3'-ACTGGA-5', 3785, 3'-ACTGGA-5', 4019, 3'-AGTGAC-5', 4088, 3'-AGTGAG-5', 4127.

L boxes

M35 boxes

Negative strand in the negative direction (from ZSCAN22 to A1BG), searching with SuccessablesM35--.bas for 3'-TTGACA-5', there are 2: 3'-TTGACA-5', 477, 3'-TTGACA-5', 4399.

M boxes

Metal responsive elements

Proximal promoters

On the positive strand in the negative direction there is an MRE 3'-TGCACTC-5' at 4341.

Distal promoters

Positive strand in the negative direction there are 6: 3'-TGCGCTC-5', 891, 3'-TGCACTC-5', 1348, 3'-TGCACTC-5', 2001, 3'-TGCACTC-5', 2427, 3'-TGCACCC-5', 2762, 3'-TGCACTC-5', 3290.

Inverse complement, negative strand, negative direction there are 2: 3'-GTGTGCA-5', 531, 3'-GAGTGCA-5', 1772.

Inverse complement, positive strand, negative direction there are 2: 3'-GAGTGCA-5', 1470, 3'-GTGTGCA-5', 2863.

Negative strand in the positive direction there are 11: 3'-TGCGCCC-5', 453, 3'-TGCACAC-5', 549, 3'-TGCACAC-5', 1221, 3'-TGCGCCC-5', 1247, 3'-TGCACTC-5', 1373, 3'-TGCGCCC-5', 1399, 3'-TGCACTC-5', 1473, 3'-TGCGCCC-5', 1499, 3'-TGCGCCC-5', 1657, 3'-TGCACAC-5', 2963, 3'-TGCACCC-5', 3323.

Positive strand in the positive direction there are 2: 3'-TGCGCCC-5', 872, 3'-TGCGCCC-5', 972.

Inverse complement, negative strand, positive direction there are 10: 3'-GCGTGCA-5', 546, 3'-GCGCGCA-5', 684, 3'-GGGCGCA-5', 876, 3'-GGGCGCA-5', 976, 3'-GCGTGCA-5', 1218, 3'-GTGCGCA-5', 1523, 3'-GAGTGCA-5', 1786, 3'-GAGTGCA-5', 2326, 3'-GGGTGCA-5', 2800, 3'-GGGTGCA-5', 3883.
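The listed MRE hits can be checked against a degenerate consensus. The sketch below assumes the commonly cited MRE core consensus TGCRCNC (R = A or G, N = any base), which this section does not state explicitly, and its inverse complement GNGYGCA (Y = C or T). The names `IUPAC`, `iupac_to_regex`, `MRE` and `MRE_INVERSE_COMPLEMENT` are illustrative only.

```python
import re

# Minimal IUPAC table: only the symbols needed for the assumed MRE consensus.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

def iupac_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[symbol] for symbol in consensus))

# Assumed consensus (not stated in the text) and its inverse complement.
MRE = iupac_to_regex("TGCRCNC")
MRE_INVERSE_COMPLEMENT = iupac_to_regex("GNGYGCA")

# Distinct hit sequences quoted in the proximal and distal promoter lists.
direct_hits = ["TGCGCTC", "TGCACTC", "TGCACCC", "TGCACAC", "TGCGCCC"]
inverse_complement_hits = ["GTGTGCA", "GAGTGCA", "GCGTGCA", "GCGCGCA",
                           "GGGCGCA", "GTGCGCA", "GGGTGCA"]

for hit in direct_hits:
    print(hit, bool(MRE.fullmatch(hit)))
for hit in inverse_complement_hits:
    print(hit, bool(MRE_INVERSE_COMPLEMENT.fullmatch(hit)))
```

Under that assumed consensus, every distinct sequence quoted in the lists above matches its pattern.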

Motif ten elements

There are no MTEs in either promoter.

MYB recognition elements

P boxes

Pribnow boxes

  1. negative strand in the negative direction, looking for 3'-TATAAT-5', 2, 3'-TATAAT-5', 3454, 3'-TATAAT-5', 3468,
  2. negative strand in the positive direction, looking for 3'-TATAAT-5', 1, 3'-TATAAT-5', 729,
  3. positive strand in the negative direction, looking for 3'-TATAAT-5', 0,
  4. positive strand in the positive direction, looking for 3'-TATAAT-5', 0,
  5. complement, negative strand, negative direction, looking for 3'-ATATTA-5', 0,
  6. complement, negative strand, positive direction, looking for 3'-ATATTA-5', 0,
  7. complement, positive strand, negative direction, looking for 3'-ATATTA-5', 2, 3'-ATATTA-5', 3454, 3'-ATATTA-5', 3468,
  8. complement, positive strand, positive direction, looking for 3'-ATATTA-5', 1, 3'-ATATTA-5', 729,
  9. inverse complement, negative strand, negative direction, looking for 3'-ATTATA-5', 2, 3'-ATTATA-5', 272, 3'-ATTATA-5', 603,
  10. inverse complement, negative strand, positive direction, looking for 3'-ATTATA-5', 1, 3'-ATTATA-5', 727,
  11. inverse complement, positive strand, negative direction, looking for 3'-ATTATA-5', 0,
  12. inverse complement, positive strand, positive direction, looking for 3'-ATTATA-5', 0,
  13. inverse, negative strand, negative direction, looking for 3'-TAATAT-5', 0,
  14. inverse, negative strand, positive direction, looking for 3'-TAATAT-5', 0,
  15. inverse, positive strand, negative direction, looking for 3'-TAATAT-5', 2, 3'-TAATAT-5', 272, 3'-TAATAT-5', 603,
  16. inverse, positive strand, positive direction, looking for 3'-TAATAT-5', 1, 3'-TAATAT-5', 727.
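The four search strings used in the list above (consensus, complement, inverse complement, inverse) can be derived mechanically from the consensus. The following is a minimal sketch; the function names are illustrative and are not taken from the programs used for the searches.

```python
# Base-pairing table for the complement operation.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(seq):
    """Swap each base for its pairing partner, keeping the order."""
    return "".join(PAIR[base] for base in seq)

def inverse(seq):
    """Reverse the order of the bases without complementing them."""
    return seq[::-1]

def inverse_complement(seq):
    """Complement the sequence, then reverse it."""
    return inverse(complement(seq))

def orientations(consensus):
    """Return the four search strings in the order used in the list."""
    return {
        "consensus": consensus,
        "complement": complement(consensus),
        "inverse complement": inverse_complement(consensus),
        "inverse": inverse(consensus),
    }

for name, seq in orientations("TATAAT").items():
    print(f"{name}: 3'-{seq}-5'")
```

Because a complement on one strand is the consensus on the other, paired entries report the same positions, for example items 1 and 7 (3454 and 3468) or items 9 and 15 (272 and 603).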

Prolamin boxes

  1. negative strand in the negative direction: 1, 3'-TGTAAAG-5', 2884,
  2. negative strand in the positive direction: 1, 3'-TGAAAAG-5', 489,
  3. positive strand in the negative direction: 1, 3'-TGAAAAG-5', 1627.

Pyrimidine boxes

Pyrimidine boxes and their complements occur in the negative direction: 3'-CCTTTT-5' at 2459, 3'-CCTTTT-5' at 2927, and 3'-CCTTTT-5' at 2968. Inverse pyrimidine boxes and their complements occur: 3'-AAAAGG-5' at 105, 3'-AAAAGG-5' at 1107, 3'-AAAAGG-5' at 3345, and 3'-AAAAGG-5' at 3441.

Pyrimidine boxes in the positive direction: 3'-CCTTTT-5' at 135 and 3'-CCTTTT-5' at 291 and their complements are close to ZNF497.

Retinoblastoma control elements

R response elements

STAT5s

Proximal promoters

Negative strand in the positive direction there is 1: 3'-TTCCGGGAA-5', 4247.

Distal promoters

Positive strand in the negative direction there are 2: 3'-TTCGTTGAA-5', 3506, 3'-TTCCCTGAA-5', 3782.

Positive strand in the positive direction there is 1: 3'-TTCCATGAA-5', 128.

Synaptic Activity-Responsive Elements

TACTAAC boxes

TATA boxes

Negative strand in the negative direction there are 2: 3'-TATATATA-5' at 1600 (or -2860 nts upstream from the TSS) and 3'-TATATAAA-5' at 1602 (or -2858 nts).

Positive strand in the negative direction there are 3: 3'-TATAAAAG-5' at 184 (or -4276 nts), 3'-TATAAAAG-5' at 223 (or -4237 nts), and 3'-TATATAAA-5' at 2874 (or -1586 nts).

Inverse complement, negative strand, negative direction there are 2: 3'-TATATATA-5', 1600, 3'-TTTATATA-5', 2871.

Inverse complement, positive strand, negative direction there is 1: 3'-TTTTTATA-5', 219.
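The upstream distances quoted for the TATA boxes are all consistent with subtracting a fixed offset of 4460 from the position (for example, 1600 - 4460 = -2860). That offset is inferred here from the quoted pairs; it is not stated in the text and should be treated as an assumption. A short sketch of the conversion, with illustrative names:

```python
# Assumed TSS coordinate, inferred from the quoted pairs (e.g. 1600 -> -2860).
TSS_POSITION = 4460

def relative_to_tss(position, tss=TSS_POSITION):
    """Convert a position in the searched region to a signed offset from the TSS."""
    return position - tss

# Position -> offset pairs exactly as quoted in the text.
quoted = {1600: -2860, 1602: -2858, 184: -4276, 223: -4237, 2874: -1586}

for position, offset in quoted.items():
    print(position, relative_to_tss(position), offset)
```

Under the same assumption, the inverse complement hits at 2871 and 219 would lie at -1589 and -4241 nts, respectively.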

TAT boxes

Only an inverse and its complement occur between ZSCAN22 and A1BG: 3'-TACCTAT-5' at 2996 nts from ZSCAN22.

TATCCAC boxes

None occur.

T boxes

Telomeric repeat DNA-binding factors

Copying the consensus telomeric repeat DNA-binding factor (TRF) sequence, 3'-TTAGGG-5', and pasting it into the find function ("⌘F") locates this sequence in the A1BG negative direction; the nucleotide positions can be found by the computer programs.

In the nucleotides between ZSCAN22 and A1BG there is at least one 3'-TTAGGG-5' beginning about 680 nucleotides from ZSCAN22 or ending at about 686 nts.
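The find-based search described above can be sketched in a few lines. The sequence below is a made-up toy string, not the ZSCAN22 to A1BG region, and the convention of reporting a begin offset together with an end equal to begin plus motif length is an assumption chosen to mirror the "680 ... 686" pairing; the text does not say whether it counts from 0 or 1.

```python
def find_all(motif, sequence):
    """Return (begin, end) spans for every occurrence of motif, overlaps included.

    begin is the number of nucleotides preceding the motif;
    end is begin plus the motif length.
    """
    spans = []
    start = sequence.find(motif)
    while start != -1:
        spans.append((start, start + len(motif)))
        start = sequence.find(motif, start + 1)
    return spans

# Toy sequence for illustration only (hypothetical, not genomic data).
toy = "CCATTAGGGTTTAGGGA"
print(find_all("TTAGGG", toy))
```

Run against the real sequence, the same loop would report every 3'-TTAGGG-5' rather than only the first one.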

Homo sapiens genes containing these are found using Homo sapiens "TRF (TTAGGG repeat-binding factor)".

Tetradecanoylphorbol-13-acetate response elements

TGFβ control elements

TGF-β inhibitory elements

Upstream response elements

V boxes

W boxes

Proximal promoters

Inverse W boxes occur in the negative strand, negative direction of A1BG: 3'-GGTCAA-5' at 4416 and 3'-GGTCAA-5' at 4308.

A W box occurs in the positive direction, positive strand of A1BG: 3'-CTGACC-5' and its complement at 4216. An inverse W box, 3'-GGTCAG-5' and its complement, occurs at 4270.

Distal promoters

On the negative strand in the negative direction, a W box 3'-CTGACC-5' occurs at 3749, whereas 3'-CTGACT-5' at 17, 3'-TTGACT-5' at 130, 3'-TTGACT-5' at 307, and 3'-CTGACC-5' at 734 occur close to ZSCAN22; 3'-CTGACT-5' at 1935 could be associated with ZSCAN22 or with an unknown gene between it and A1BG. All occur along with their complements.

Inverse complement, positive strand, negative direction there are 5: 3'-GGTCAG-5', 440, 3'-GGTCAG-5', 577, 3'-GGTCAG-5', 713, 3'-GGTCAG-5', 2249, 3'-GGTCAG-5', 2586.

A W box inverse, 3'-GGTCAG-5', occurs at 1353 in the negative direction.

W boxes 3'-AGTCAG-5' at 2101, 3'-GGTCAG-5' at 2221, 3'-AGTCAG-5' at 2608, 3'-AGTCAA-5' at 2614, and 3'-AGTCAG-5' at 2619 occur, along with their complements, in the positive direction.

W boxes in the positive direction occur 3'-CTGACC-5' at 1662, 3'-CTGACC-5' at 2213, 3'-TTGACC-5' at 2873, 3'-CTGACT-5' at 2945, and 3'-TTGACC-5' at 4018 that could be associated with A1BG, along with 3'-TTGACC-5' at 1953, 3'-CTGACT-5' at 2674, and 3'-TTGACT-5' at 3735.

Inverse complement, positive strand, positive direction there are 6: 3'-GGTCAG-5', 2025, 3'-AGTCAG-5', 2099, 3'-GGTCAG-5', 2606, 3'-GGTCAG-5', 2997, 3'-GGTCAG-5', 3083, 3'-GGTCAA-5', 3380.

X boxes

There are no X boxes in either promoter.

X core promoter elements

  1. negative strand in the negative direction, looking for 3'-G/A/T-G/C-G-T/C-G-G-G/A-A-G/C-A/C-5', 1, 3'-TGGTGGGACC-5', 3744,
  2. negative strand in the positive direction, looking for 3'-G/A/T-G/C-G-T/C-G-G-G/A-A-G/C-A/C-5', 0,
  3. positive strand in the negative direction, looking for 3'-G/A/T-G/C-G-T/C-G-G-G/A-A-G/C-A/C-5', 0,
  4. positive strand in the positive direction, looking for 3'-G/A/T-G/C-G-T/C-G-G-G/A-A-G/C-A/C-5', 0,
  5. complement, negative strand, negative direction, looking for 3'-C/A/T-G/C-C-A/G-C-C-C/T-T-G/C-G/T-5', 0,
  6. complement, negative strand, positive direction, looking for 3'-C/A/T-G/C-C-A/G-C-C-C/T-T-G/C-G/T-5', 0,
  7. complement, positive strand, negative direction, looking for 3'-C/A/T-G/C-C-A/G-C-C-C/T-T-G/C-G/T-5', 1, 3'-ACCACCCTGG-5', 3744,
  8. complement, positive strand, positive direction, looking for 3'-C/A/T-G/C-C-A/G-C-C-C/T-T-G/C-G/T-5', 0,
  9. inverse complement, negative strand, negative direction, looking for 3'-G/T-G/C-T-C/T-C-C-A/G-C-G/C-C/A/T-5', 0,
  10. inverse complement, negative strand, positive direction, looking for 3'-G/T-G/C-T-C/T-C-C-A/G-C-G/C-C/A/T-5', 0,
  11. inverse complement, positive strand, negative direction, looking for 3'-G/T-G/C-T-C/T-C-C-A/G-C-G/C-C/A/T-5', 1, 3'-GCTCCCACCT-5', 392,
  12. inverse complement, positive strand, positive direction, looking for 3'-G/T-G/C-T-C/T-C-C-A/G-C-G/C-C/A/T-5', 0,
  13. inverse, negative strand, negative direction, looking for 3'-A/C-G/C-A-G/A-G-G-T/C-G-G/C-G/A/T-5', 1, 3'-CGAGGGTGGA-5', 392,
  14. inverse, negative strand, positive direction, looking for 3'-A/C-G/C-A-G/A-G-G-T/C-G-G/C-G/A/T-5', 1, 3'-CCAGGGTGGG-5', 102,
  15. inverse, positive strand, negative direction, looking for 3'-A/C-G/C-A-G/A-G-G-T/C-G-G/C-G/A/T-5', 0,
  16. inverse, positive strand, positive direction, looking for 3'-A/C-G/C-A-G/A-G-G-T/C-G-G/C-G/A/T-5', 0.
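The slash-and-hyphen notation used in the list above maps directly onto a regular expression with one character class per position. The sketch below is illustrative: `consensus_to_regex` and `find_hits` are hypothetical helper names, and reporting 1-based start positions is an assumption, since the text does not state its indexing convention.

```python
import re

def consensus_to_regex(consensus):
    """Turn 'G/A/T-G/C-G-...' into a compiled regex such as '[GAT][GC]G...'."""
    parts = []
    for position in consensus.split("-"):
        bases = position.split("/")
        parts.append(bases[0] if len(bases) == 1 else "[" + "".join(bases) + "]")
    return re.compile("".join(parts))

def find_hits(consensus, sequence):
    """Return assumed 1-based start positions of all matches, overlaps included."""
    lookahead = re.compile("(?=(" + consensus_to_regex(consensus).pattern + "))")
    return [match.start() + 1 for match in lookahead.finditer(sequence)]

# Consensus variants and the hit sequences reported in the list above.
REPORTED = [
    ("G/A/T-G/C-G-T/C-G-G-G/A-A-G/C-A/C", "TGGTGGGACC"),
    ("C/A/T-G/C-C-A/G-C-C-C/T-T-G/C-G/T", "ACCACCCTGG"),
    ("G/T-G/C-T-C/T-C-C-A/G-C-G/C-C/A/T", "GCTCCCACCT"),
    ("A/C-G/C-A-G/A-G-G-T/C-G-G/C-G/A/T", "CGAGGGTGGA"),
    ("A/C-G/C-A-G/A-G-G-T/C-G-G/C-G/A/T", "CCAGGGTGGG"),
]

for consensus, hit in REPORTED:
    print(hit, bool(consensus_to_regex(consensus).fullmatch(hit)))
```

Each of the five reported hit sequences satisfies the variant it is listed under. `find_hits` can then be pointed at the actual promoter sequence to recover positions.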

Y boxes

There are no Y boxes in either promoter.

Z boxes

Hypotheses

  1. Downstream core promoters may work as transcription factors even as their complements or inverses.
  2. In addition to the DNA binding sequences listed above, the transcription factors that can open up and attach through the local epigenome need to be known and specified.

See also

References

  1. "Entrez Gene: Alpha-1-B glycoprotein". Retrieved 2012-11-09.
  2. 2.0 2.1 "A1BG alpha-1-B glycoprotein". Retrieved May 10, 2013.
  3. 3.0 3.1 HGNC2019 (10 December 2019). "ZSCAN22 zinc finger and SCAN domain containing 22 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  4. 4.0 4.1 HGNC2019 (10 December 2019). "MIR6806 microRNA 6806 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  5. Jag123 (7 March 2005). "antigen". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 7 March 2020.
  6. SemperBlotto (21 April 2008). "immunogen". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 8 March 2020.
  7. 7.0 7.1 7.2 C. Michael Gibson (27 April 2008). "Antigen". Boston, Massachusetts: WikiDoc Foundation. Retrieved 8 March 2020.
  8. Williamsayers79 (26 February 2007). "antibody". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 7 March 2020.
  9. Jag123 (7 March 2005). "antibody". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 7 March 2020.
  10. Eleonora Market, F. Nina Papavasiliou (2003) V(D)J Recombination and the Evolution of the Adaptive Immune System PLoS Biology 1(1): e16.
  11. Charles A. Janeway, Jr; et al. (2001). Immunobiology (5th ed.). Garland Publishing. ISBN 0-8153-3642-X.
  12. SemperBlotto (25 February 2006). "immunoglobulin". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 7 March 2020.
  13. SemperBlotto (28 April 2008). "immunoglobulin". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 7 March 2020.
  14. 14.0 14.1 RefSeqJuly2008 (10 December 2019). "A1BG alpha-1-B glycoprotein [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  15. Tian M, Cui YZ, Song GH, Zong MJ, Zhou XY, Chen Y, Han JX (2008). "Proteomic analysis identifies MMP-9, DJ-1 and A1BG as overexpressed proteins in pancreatic juice from pancreatic ductal adenocarcinoma patients". BMC Cancer. 8: 241. doi:10.1186/1471-2407-8-241. PMC 2528014. PMID 18706098.
  16. 16.0 16.1 16.2 16.3 HGNC2019 (10 December 2019). "A1BG-AS1 A1BG antisense RNA 1 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  17. 17.0 17.1 17.2 17.3 17.4 17.5 Noriaki Ishioka, Nobuhiro Takahashi, and Frank W. Putnam (April 1986). "Amino acid sequence of human plasma 𝛂1B-glycoprotein: Homology to the immunoglobulin supergene family" (PDF). Proceedings of the National Academy of Sciences USA. 83 (8): 2363–7. doi:10.1073/pnas.83.8.2363. PMID 3458201. Retrieved 9 March 2020.
  18. 18.0 18.1 Katrina M. Morris, Denis O’Meally, Thiri Zaw, Xiaomin Song, Amber Gillett, Mark P. Molloy, Adam Polkinghorne, and Katherine Belova (7 October 2016). "Characterisation of the immune compounds in koala milk using a combined transcriptomic and proteomic approach". Scientific Reports. 6: 35011. doi:10.1038/srep35011. PMID 27713568. Retrieved 14 March 2020.
  19. 24.98.118.180 (28 February 2007). "species". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  20. 20.0 20.1 Peter coxhead (22 August 2018). "Species". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  21. Chiswick Chap (1 December 2016). "Species". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  22. 22.0 22.1 22.2 22.3 "AceView: A1BG". Retrieved May 11, 2013.
  23. Pdeitiker (26 July 2008). "variant". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  24. SemperBlotto (6 January 2007). "isoform". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2 December 2018.
  25. 72.178.245.181 (30 November 2008). "isoform". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2 December 2018.
  26. H Eiberg, ML Bisgaard, J Mohr (1 December 1989). "Linkage between alpha 1B-glycoprotein (A1BG) and Lutheran (LU) red blood group system: assignment to chromosome 19: new genetic variants of A1BG". Clinical genetics. 36 (6): 415–8. PMID 2591067. Retrieved 2017-10-08.
  27. John R. Stehle Jr., Mark E. Weeks, Kai Lin, Mark C. Willingham, Amy M. Hicks, John F. Timms, Zheng Cui (January 2007). "Mass spectrometry identification of circulating alpha-1-B glycoprotein, increased in aged female C57BL/6 mice". Biochimica et Biophysica Acta (BBA) - General Subjects. 1770 (1): 79–86. Retrieved 2017-10-08.
  28. 28.0 28.1 28.2 28.3 28.4 Caitrin W. McDonough, Yan Gong, Sandosh Padmanabhan, Ben Burkley, Taimour Y. Langaee, Olle Melander, Carl J. Pepine, Anna F. Dominiczak, Rhonda M. Cooper-DeHoff, Julie A. Johnson (June 2013). "Pharmacogenomic Association of Nonsynonymous SNPs in SIGLEC12, A1BG, and the Selectin Region and Cardiovascular Outcomes" (PDF). Hypertension. 62 (1): 48–54. doi:10.1161/HYPERTENSIONAHA.111.00823. PMID 23690342. Retrieved 2017-10-08.
  29. DTLHS (10 January 2018). "genotype". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  30. SemperBlotto (22 October 2005). "genotype". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  31. Widsith (28 March 2012). "polymorphism". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  32. 217.105.66.98 (8 September 2016). "allele". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  33. 138.130.33.215 (7 April 2004). "allele". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 25 March 2020.
  34. 34.0 34.1 B. Gahne, R. K. Juneja, and A. Stratil (June 1987). "Genetic polymorphism of human plasma alpha 1B-glycoprotein: phenotyping by immunoblotting or by a simple method of 2-D electrophoresis". Human Genetics. 76 (2): 111–5. doi:10.1007/bf00284904. PMID 3610142. Retrieved 25 March 2020.
  35. 35.0 35.1 R.K. Juneja, N. Saha, B. Gahne and J.S.H. Tay (1989). "Distribution of Plasma Alpha-1-B-Glycoprotein Phenotypes in Several Mongoloid Populations of East Asia". Human Heredity. 39: 218–222. doi:10.1159/000153863. Retrieved 25 March 2020.
  36. 24.235.196.118 (23 September 2007). "phenotype". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2016-10-04.
  37. SemperBlotto (14 February 2005). "phenotype". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2016-10-04.
  38. N2e (3 July 2008). "phenotype". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2016-10-04.
  39. Mardiaty Iryani Abdullah, Ching Chin Lee, Sarni Mat Junit, Khoon Leong Ng, and Onn Haji Hashim (13 September 2016). "Tissue and serum samples of patients with papillary thyroid cancer with and without benign background demonstrate different altered expression of proteins". Peer J. 4: e2450. doi:10.7717/peerj.2450. PMID 27672505. Retrieved 15 March 2020.
  40. 40.0 40.1 40.2 40.3 Udby L, Sørensen OE, Pass J, Johnsen AH, Behrendt N, Borregaard N, Kjeldsen L. (2004). "Cysteine-rich secretory protein 3 is a ligand of alpha1B-glycoprotein in human plasma". Biochemistry. 43 (40): 12877–86. doi:10.1021/bi048823e. PMID 15461460.
  41. "The Opossum: Our Marvelous Marsupial, The Social Loner". Wildlife Rescue League.
  42. Journal Of Venomous Animals And Toxins – Anti-Lethal Factor From Opossum Serum Is A Potent Antidote For Animal, Plant And Bacterial Toxins. Retrieved 2009-12-29.
  43. 43.0 43.1 B Haendler, J Krätzschmar, F Theuring and W D Schleuning (1993). "Transcripts for cysteine-rich secretory protein-1 (CRISP-1; DE/AEG) and the novel related CRISP-3 are expressed under androgen control in the mouse salivary gland". Endocrinology. 133 (1): 192–8. doi:10.1210/en.133.1.192. PMID 8319566. Retrieved 2012-02-20.
  44. 44.0 44.1 HGNC2019 (10 December 2019). "ZNF497 zinc finger protein 497 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  45. 45.0 45.1 HGNC2019 (10 December 2019). "LOC100419840 zinc finger protein 446 pseudogene [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  46. 46.0 46.1 HGNC2019 (10 December 2019). "LOC105372483 uncharacterized LOC105372483 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  47. 47.0 47.1 HGNC2019 (10 December 2019). "RNA5SP473 RNA, 5S ribosomal pseudogene 473 [ Homo sapiens (human) ]". U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA: National Center for Biotechnology Information. Retrieved 2019-12-18.
  48. 48.0 48.1 PA Johnson, D Bunick, NB Hecht (1991). "Protein Binding Regions in the Mouse and Rat Protamine-2 Genes" (PDF). Biology of Reproduction. 44 (1): 127–134. Retrieved 6 April 2019.
  49. Amber Paratore Sanchez and Kumar Sharma (July 2009). "Transcription factors in the pathogenesis of diabetic nephropathy". Expert Reviews in Molecular Medicine. 11: e13. doi:10.1017/S1462399409001057. Retrieved 1 October 2018.

External links
