Chemical formula


Editor-In-Chief: C. Michael Gibson, M.S., M.D. [1]

A chemical formula is a concise way of expressing information about the atoms that constitute a particular chemical compound. Chemical formulas are also used in chemical equations as a short way of showing how a chemical reaction occurs. For molecular compounds, a formula identifies each constituent element by its chemical symbol and indicates the number of atoms of each element found in each discrete molecule of that compound. If a molecule contains more than one atom of a particular element, this quantity is indicated using a subscript after the chemical symbol (although 19th-century books often used superscripts). For ionic compounds and other non-molecular substances, the subscripts indicate the ratio of elements in the empirical formula.
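
The symbol-plus-subscript convention described above can be read mechanically. The following is a minimal sketch in Python, assuming subscripts are written as plain digits after the symbol (as in CH4) and that the formula contains no parentheses, bond marks, or charges. The name `count_atoms` is hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# An element symbol is one capital letter, optionally followed by one
# lowercase letter; the digits after it are the subscript. A missing
# subscript means a single atom.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")

def count_atoms(formula: str) -> dict[str, int]:
    """Count atoms in a flat formula such as 'CH4' (illustrative helper)."""
    counts = Counter()
    for symbol, digits in TOKEN.findall(formula):
        counts[symbol] += int(digits) if digits else 1
    return dict(counts)

print(count_atoms("CH4"))      # methane: {'C': 1, 'H': 4}
print(count_atoms("C6H12O6"))  # glucose: {'C': 6, 'H': 12, 'O': 6}
```

The sketch silently skips characters it does not recognise, so it should not be relied on to validate a formula.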

This system for writing chemical formulas was invented by the 19th-century Swedish chemist Jöns Jacob Berzelius.

Molecular and structural formula

For example, methane, a simple molecule consisting of one carbon atom bonded to four hydrogen atoms, has the chemical formula:

CH4

and glucose, with six carbon atoms, twelve hydrogen atoms and six oxygen atoms, has the chemical formula:

C6H12O6


A chemical formula may also supply information about the types and arrangement of bonds in the chemical, though it does not necessarily specify the exact isomer. For example, ethane consists of two carbon atoms single-bonded to each other, with each carbon atom having three hydrogen atoms bonded to it. Its chemical formula can be rendered as CH3CH3. If there were a double bond between the carbon atoms (so that each carbon had only two hydrogens), the chemical formula could be written CH2CH2, with the double bond between the carbons left implicit. However, a more explicit method is to write H2C:CH2 or H2C=CH2. The two dots or lines indicate that a double bond connects the atoms on either side of them.

A triple bond may be expressed with three dots or lines, and if there may be ambiguity, a single dot or line may be used to indicate a single bond.

Molecules with multiple functional groups that are the same may be expressed in the following way: (CH3)3CH. However, this implies a different structure from other molecules that can be formed using the same atoms (isomers). The formula (CH3)3CH implies a chain of three carbon atoms, with the middle carbon atom bonded to another carbon:

[Figure: carbon chain]

and the remaining bonds on the carbons all leading to hydrogen atoms. However, the same number of atoms (10 hydrogens and 4 carbons, or C4H10) may be used to make a straight chain: CH3CH2CH2CH3.
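
The claim that (CH3)3CH and CH3CH2CH2CH3 contain the same atoms can be checked by expanding the parentheses. The sketch below is one possible way to do this in Python, using a stack of partial counts. The function name `expand` is hypothetical, and the code assumes plain-digit subscripts and rejects bond marks such as = or :.

```python
import re
from collections import Counter

# One token is: an element with optional count, an opening parenthesis,
# or a closing parenthesis with an optional group multiplier.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)|(\()|\)(\d*)")

def expand(formula: str) -> Counter:
    """Return total atom counts, multiplying out parenthesised groups."""
    stack = [Counter()]  # stack[-1] collects the group currently open
    pos = 0
    while pos < len(formula):
        match = TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        symbol, count, open_paren, group_count = match.groups()
        if symbol:
            stack[-1][symbol] += int(count) if count else 1
        elif open_paren:
            stack.append(Counter())
        else:  # closing parenthesis: fold the group into its parent
            if len(stack) == 1:
                raise ValueError("unmatched closing parenthesis")
            group = stack.pop()
            multiplier = int(group_count) if group_count else 1
            for element, n in group.items():
                stack[-1][element] += n * multiplier
        pos = match.end()
    if len(stack) != 1:
        raise ValueError("unclosed parenthesis")
    return stack[0]

branched = expand("(CH3)3CH")
straight = expand("CH3CH2CH2CH3")
print(dict(branched), branched == straight)
```

Both formulas reduce to 4 carbons and 10 hydrogens, which is why the atom counts alone cannot distinguish the two isomers.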

The alkene but-2-ene has two isomers, which the chemical formula CH3CH=CHCH3 does not distinguish. The relative position of the two methyl groups must be indicated by additional notation denoting whether the methyl groups are on the same side of the double bond (cis or Z) or on opposite sides from each other (trans or E).


For polymers, parentheses are placed around the repeating unit. For example, the hydrocarbon CH3(CH2)50CH3 is a molecule with 50 repeating CH2 units. If the number of repeating units is unknown or variable, the letter n may be used to indicate this: CH3(CH2)nCH3.
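
As a small worked illustration of the n notation, the atom totals of CH3(CH2)nCH3 follow directly from n: each repeating unit adds one carbon and two hydrogens to the two CH3 end groups. The Python sketch below uses a hypothetical helper name, `chain_counts`.

```python
def chain_counts(n: int) -> dict[str, int]:
    """Atom counts for CH3(CH2)nCH3 with n repeating CH2 units."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Two CH3 end groups contribute 2 C and 6 H; each CH2 adds 1 C and 2 H.
    return {"C": n + 2, "H": 2 * n + 6}

counts = chain_counts(50)
print(f"C{counts['C']}H{counts['H']}")  # C52H106
```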


For ions, the charge on a particular atom may be denoted with a right-hand superscript, for example Na⁺ or Cu²⁺. The total charge on a charged molecule or a polyatomic ion may also be shown in this way, for example hydronium, H₃O⁺, or sulfate, SO₄²⁻.


Although isotopes are more relevant to nuclear chemistry or stable isotope chemistry than to conventional chemistry, different isotopes may be indicated with a left-hand superscript in a chemical formula. For example, the phosphate ion containing radioactive phosphorus-32 is ³²PO₄³⁻. Likewise, a study involving stable isotope ratios might include ¹⁸O:¹⁶O.

A left-hand subscript is sometimes used to indicate redundantly, for convenience, the atomic number.

Empirical formula

In chemistry, the empirical formula of a chemical is a simple expression of the relative number of each type of atom or ratio of the elements in the compound. Empirical formulas are the standard for ionic compounds, such as CaCl2, and for macromolecules, such as SiO2. An empirical formula makes no reference to isomerism, structure, or absolute number of atoms. The term empirical refers to the process of elemental analysis, a technique of analytical chemistry used to determine the relative percent composition of a pure chemical substance by element.
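
To make the link between elemental analysis and the empirical formula concrete, the sketch below converts mass percentages into the smallest whole-number atom ratio. It is an illustration under stated assumptions, not a description of any laboratory procedure. The function name, the tolerance, and the multiplier limit are hypothetical choices. The input percentages are not measured data: they were computed from CaCl2 using standard atomic masses and rounded to two decimals.

```python
# Standard atomic masses in g/mol (only the elements needed here).
ATOMIC_MASS = {"Ca": 40.078, "Cl": 35.45}

def empirical_from_percent(percent, max_multiplier=10, tol=0.05):
    """Turn mass percentages into the smallest whole-number atom ratio."""
    # Mass percent divided by atomic mass gives relative moles of atoms.
    moles = {el: p / ATOMIC_MASS[el] for el, p in percent.items()}
    smallest = min(moles.values())
    ratios = {el: m / smallest for el, m in moles.items()}
    # Scale by small integers until every ratio is close to a whole number.
    for k in range(1, max_multiplier + 1):
        scaled = {el: r * k for el, r in ratios.items()}
        if all(abs(v - round(v)) <= tol for v in scaled.values()):
            return {el: round(v) for el, v in scaled.items()}
    raise ValueError("no small whole-number ratio found")

# Illustrative input derived from CaCl2 itself (assumption, see above).
print(empirical_from_percent({"Ca": 36.11, "Cl": 63.89}))  # {'Ca': 1, 'Cl': 2}
```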

For example, hexane has a molecular formula of C6H14, or structurally CH3CH2CH2CH2CH2CH3, implying that it has a chain of 6 carbon atoms and a total of 14 hydrogen atoms. However, the empirical formula for hexane is C3H7. Likewise, the empirical formula for hydrogen peroxide, H2O2, is simply HO, expressing the 1:1 ratio of component elements.
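
Reducing a molecular formula to its empirical formula amounts to dividing every atom count by their greatest common divisor. A minimal Python sketch follows, with the hypothetical helper name `empirical_formula` and atom counts supplied as a dictionary.

```python
from functools import reduce
from math import gcd

def empirical_formula(counts: dict[str, int]) -> dict[str, int]:
    """Divide all atom counts by their greatest common divisor."""
    divisor = reduce(gcd, counts.values())
    return {element: n // divisor for element, n in counts.items()}

print(empirical_formula({"C": 6, "H": 14}))  # hexane: {'C': 3, 'H': 7}
print(empirical_formula({"H": 2, "O": 2}))   # hydrogen peroxide: {'H': 1, 'O': 1}
```

The reduction is one-way: C3H7 alone does not reveal whether the molecule is C6H14 or some other multiple.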

Non-stoichiometric formulas

Main article: Non-stoichiometric compound

Chemical formulas most often use natural numbers for each of the elements. However, there is a whole class of compounds, called non-stoichiometric compounds, that cannot be represented by well-defined natural numbers. Such a formula might be written using real numbers, as in Fe0.95O, or it might include a variable part represented by a letter, as in Fe1–xO.

General forms for organic compounds

A chemical formula used for a series of compounds that differ from each other by a constant unit is called a general formula. Such a series is called a homologous series, and its members are called homologs. The Hill system is a common convention for writing and sorting formulas.
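
As an illustration of a general formula, the sketch below uses the straight-chain alkanes, whose general formula is CnH2n+2. The choice of series and the helper names are assumptions made for this example and are not taken from the text above. The code shows that successive homologs differ by the constant unit CH2.

```python
def alkane_counts(n: int) -> dict[str, int]:
    """Atom counts for the alkane with n carbons (general formula CnH2n+2)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return {"C": n, "H": 2 * n + 2}

def difference(smaller: dict[str, int], larger: dict[str, int]) -> dict[str, int]:
    """Per-element difference between two members of the series."""
    return {element: larger[element] - smaller[element] for element in larger}

for n in range(1, 4):
    step = difference(alkane_counts(n), alkane_counts(n + 1))
    print(n, alkane_counts(n), "next homolog adds", step)
```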

Hill system

The Hill system is a system of writing chemical formulas such that the number of carbon atoms in a molecule is indicated first, the number of hydrogen atoms next, and then the number of all other chemical elements subsequently, in alphabetical order. When the formula contains no carbon, all the elements, including hydrogen, are listed alphabetically.
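
The ordering rule above can be sketched in a few lines of Python. The function name `hill_formula` is hypothetical. The input is a dictionary of atom counts, and a count of 1 is written without a digit.

```python
def hill_formula(counts: dict[str, int]) -> str:
    """Write atom counts as a formula in Hill system order."""
    symbols = sorted(counts)  # alphabetical by element symbol
    if "C" in counts:
        # Carbon first, hydrogen second, everything else alphabetically.
        others = [s for s in symbols if s not in ("C", "H")]
        order = ["C"] + (["H"] if "H" in counts else []) + others
    else:
        # No carbon: all elements, including hydrogen, alphabetically.
        order = symbols
    return "".join(s + (str(counts[s]) if counts[s] != 1 else "") for s in order)

print(hill_formula({"Br": 1, "C": 2, "H": 5}))  # C2H5Br
print(hill_formula({"H": 1, "Br": 1}))          # BrH
print(hill_formula({"S": 1, "O": 4, "H": 2}))   # H2O4S
```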

Formulas written according to these rules can be collated into what is known as Hill system order. Formulas are compared element by element in the order written, first by element symbol and then by the number of atoms, with differences in earlier elements or numbers treated as more significant than differences in any later element or number, much like sorting text strings into lexicographic order.

The Hill system was first published by Edwin A. Hill of the United States Patent Office in 1900.


The following formulas are written using the Hill system, and listed in Hill order:

  1. BrH
  2. BrI
  3. CH3I
  4. C2H5Br
  5. HI
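
The order of the list above can be reproduced with a short sort. The sketch below assumes each formula is already written in Hill notation with plain-digit subscripts. The key function name `hill_sort_key` is hypothetical. It compares formulas as sequences of (symbol, count) pairs, so that earlier elements and numbers take precedence.

```python
import re

TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")

def hill_sort_key(formula: str) -> list[tuple[str, int]]:
    """Sequence of (symbol, count) pairs in the order written."""
    return [(symbol, int(digits) if digits else 1)
            for symbol, digits in TOKEN.findall(formula)]

formulas = ["HI", "C2H5Br", "BrI", "CH3I", "BrH"]
print(sorted(formulas, key=hill_sort_key))
# ['BrH', 'BrI', 'CH3I', 'C2H5Br', 'HI']
```

Comparing numeric counts rather than raw strings matters here: a plain string sort would place C2H5Br before CH3I, because the character 2 sorts before H.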
