Data

In everyday language, data is a synonym for information.[1] The exact sciences draw a clear distinction between the two: data are measurements, which may be disorganized, and data become information once they are organized. Data may relate to reality or to fiction, as in a fictional movie. Data about reality consist of propositions. A large and practically important class of such propositions consists of measurements or observations of a variable. Such propositions may comprise numbers, words or images.

Etymology

The word data is the plural of Latin datum, neuter past participle of dare, "to give", hence "something given". The past participle of "to give" has been used for millennia in the sense of a statement accepted at face value; one of the works of Euclid, circa 300 BC, was the Dedomena (in Latin, Data). In discussions of problems in geometry, mathematics, engineering, and so on, the terms givens and data are used interchangeably. Such usage is the origin of data as a concept in computer science: data are numbers, words, images, etc., accepted as they stand. The word is pronounced dey-tuh, dat-uh, or dah-tuh.

Experimental data are data generated within the context of a scientific investigation.

Usage in English

In English, the word datum is still used in the general sense of "something given", and more specifically in cartography, geography, geology, NMR and drafting to mean a reference point, reference line, or reference surface. More generally, any measurement or result can be called a (single) datum, but data point is more common.[3]

Both datums (see usage in datum article) and the originally Latin plural data are used as the plural of datum in English, but data is more commonly treated as a mass noun and used in the singular, especially in day-to-day usage. For example, "This is all the data from the experiment". This usage is inconsistent with the rules of Latin grammar, which would instead suggest "These are all the data from the experiment", but these are English sentences, so Latin grammar rules do not apply. Many British and UN academic, scientific, and professional style guides (e.g., see page 43 of the World Health Organization Style Guide) request that authors treat data as a plural noun. Nevertheless, data is now usually treated as a singular mass noun in both informal and educated usage, although usage in scientific publications shows a strong UK/U.S. divide. U.S. usage prefers treating data as singular in all contexts, including serious and academic publishing.[2] UK usage now widely accepts treating data as singular in standard English,[3] including educated everyday usage,[4] at least in non-scientific use. UK scientific publishing usually still prefers treating it as a plural.[5] Some UK university style guides recommend using data for both singular and plural use,[6] and some recommend treating it as a singular only in connection with computers.[7]

Uses of data in science and computing

Raw data are numbers, characters, images or other outputs from devices that convert physical quantities into symbols, in a very broad sense. Such data are typically further processed by a human or input into a computer, stored and processed there, or transmitted (output) to another human or computer. Raw data is a relative term; data processing commonly occurs in stages, and the "processed data" from one stage may be considered the "raw data" of the next.
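
The relative nature of raw data can be illustrated with a minimal Python sketch. Everything in it is a hypothetical assumption made for illustration: the sensor counts, the scale of 5 millivolts per count, and the function names. Each stage takes the output of the previous stage as its own raw input.

```python
from statistics import mean

# Hypothetical raw output of a measuring device: integer counts.
raw_counts = [100, 102, 98]

def to_millivolts(counts, millivolts_per_count=5):
    """Stage 1: convert device counts into millivolts (assumed linear scale)."""
    return [count * millivolts_per_count for count in counts]

def summarize(values):
    """Stage 2: reduce a list of measurements to a small summary."""
    return {"n": len(values), "mean": mean(values), "max": max(values)}

# The "processed data" of stage 1 is the "raw data" of stage 2.
millivolts = to_millivolts(raw_counts)
summary = summarize(millivolts)

print(millivolts)
print(summary)
```

Neither list is raw in an absolute sense: the millivolt values are processed with respect to the counts and raw with respect to the summary.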

Mechanical computing devices are classified according to the means by which they represent data. An analog computer represents a datum as a voltage, distance, position, or other physical quantity. A digital computer represents a datum as a sequence of symbols drawn from a fixed alphabet. The most common digital computers use a binary alphabet, that is, an alphabet of two characters, typically denoted "0" and "1". More familiar representations, such as numbers or letters, are then constructed from the binary alphabet.
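
As a sketch of how familiar representations are built from the binary alphabet, the following Python fragment writes a number and a short piece of text as sequences of "0" and "1". The fixed width of eight symbols and the use of the UTF-8 encoding are assumptions made for this example; other widths and encodings are equally valid.

```python
def to_bits(number, width=8):
    """Write a non-negative integer as a string of '0' and '1' symbols."""
    return format(number, f"0{width}b")

def text_to_bits(text):
    """Encode text as UTF-8 bytes and write each byte as eight binary symbols."""
    return [to_bits(byte) for byte in text.encode("utf-8")]

def bits_to_text(bit_groups):
    """Reverse text_to_bits: rebuild the bytes and decode them as UTF-8."""
    return bytes(int(bits, 2) for bits in bit_groups).decode("utf-8")

print(to_bits(5))                          # 00000101
print(text_to_bits("Hi"))                  # ['01001000', '01101001']
print(bits_to_text(text_to_bits("data")))  # data
```

The round trip through bits_to_text shows that nothing is lost: the letters are recovered exactly from the binary symbols.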

Some special forms of data are distinguished. A computer program is a collection of data, which can be interpreted as instructions. Most computer languages make a distinction between programs and the other data on which programs operate, but in some languages, notably Lisp and similar languages, programs are essentially indistinguishable from other data. It is also useful to distinguish metadata, that is, a description of other data. A similar yet earlier term for metadata is "ancillary data." The prototypical example of metadata is the library catalog, which is a description of the contents of books.
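
The idea that a program is itself data can be sketched in Python as well, although Python keeps the distinction more visible than Lisp does. The example below uses the standard ast module to hold a small expression as a data structure, inspect it, and only then interpret it as instructions. The expression, the variable names and the metadata record are hypothetical and chosen only for illustration.

```python
import ast

# A tiny program, held as ordinary text data.
source = "price * quantity + shipping"

# Parse the text into a tree of Python objects that can be inspected like any data.
tree = ast.parse(source, mode="eval")
names = sorted({node.id for node in ast.walk(tree) if isinstance(node, ast.Name)})

# Metadata: a description of the program rather than the program itself.
metadata = {"source": source, "variables": names}

# Interpret the same data as instructions by compiling and evaluating it.
code = compile(tree, filename="<expression>", mode="eval")
result = eval(code, {"__builtins__": {}}, {"price": 3, "quantity": 4, "shipping": 5})

print(metadata)
print(result)  # 17
```

The same object is treated first as data (when its variable names are listed) and then as a program (when it is evaluated), while the metadata dictionary plays the role that a catalog entry plays for a book.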

Meaning of data, information and knowledge

The terms data, information and knowledge are frequently used for overlapping concepts. These three concepts are ill-defined or ambiguously defined in the subject-matter literature. However, in recent interdisciplinary research a few independent specializations of these terms have been proposed.

References

  1. http://www.dict.org/bin/Dict?Form=Dict1&Query=data&Strategy=*&Database=*
  2. "Sometimes scientists think of data as plural, as in These data do not support the conclusions. But more often scientists and researchers think of data as a singular mass entity like information, and most people now follow this in general usage."[1]
  3. New Oxford Dictionary of English, 1999
  4. "...in educated everyday usage as represented by the Guardian newspaper, it is nowadays most often used as a singular."[2]
