Simply put, metadata is often described as "data about data." But it's so much more complex (and fascinating) than that! Metadata is data that provides descriptive information (content, context, quality, structure, and accessibility) about a data product and enables others to search for and use that data product. In a lab setting, much of the content used to describe data is initially collected in a notebook; metadata is a more formal, shareable expression of this information. It can include content such as contact information, geographic locations, details about units of measure, abbreviations or codes used in the dataset, instrument and protocol information, survey tool details, provenance and version information, and much more. Where no appropriate formal metadata standard exists, writing “readme”-style metadata is an appropriate strategy for internal use.
The Digital Curation Centre provides a catalog of common metadata standards, organized by discipline.
Some specific examples of metadata standards, both general and domain-specific, are:
Metadata can exist in a variety of formats, including:
A text or HTML document
You can also use a text document to create a data dictionary. A data dictionary records information about metadata elements, sub-elements, and attributes, and provides sample content. It is a good place to record which metadata standard you are using and any variations from that standard. Read more about metadata standards below.
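As a minimal sketch of what a plain-text data dictionary might look like, the Python snippet below renders a few entries to text. The element names (`site_id`, `temp_c`), their descriptions, the sample values, and the function `render_data_dictionary` are all hypothetical and chosen only for illustration; they do not come from any particular metadata standard.

```python
# Hypothetical data dictionary entries. The element names, units, and
# sample values are illustrative assumptions, not part of any standard.
ENTRIES = [
    {
        "element": "site_id",
        "description": "Sampling site code",
        "units": "n/a",
        "sample": "LK-03",
    },
    {
        "element": "temp_c",
        "description": "Water temperature",
        "units": "degrees Celsius",
        "sample": "14.2",
    },
]


def render_data_dictionary(entries, standard="none (readme-style)", deviations="n/a"):
    """Return a plain-text data dictionary as a single string."""
    # Header: record which standard is in use and any variation from it.
    lines = [
        f"Metadata standard: {standard}",
        f"Deviations from standard: {deviations}",
        "",
    ]
    # One short block per element, including sample content.
    for entry in entries:
        lines.append(f"Element: {entry['element']}")
        lines.append(f"  Description: {entry['description']}")
        lines.append(f"  Units: {entry['units']}")
        lines.append(f"  Sample content: {entry['sample']}")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_data_dictionary(ENTRIES))
```

The resulting text can be saved alongside the data files, for example as a readme, so that the definitions travel with the dataset.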
An XML document, either linked to the data files or embedded within them
If your metadata is in XML, it probably already follows an established metadata standard. For example, XML tags such as <dc:title> correspond to a set of defined Dublin Core elements. Dublin Core is one of the most common metadata standards, and may meet most of your metadata needs. Read more about the Dublin Core Metadata Element Set, Version 1.1.
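To illustrate how tags such as <dc:title> relate to Dublin Core elements, here is a small sketch that builds an XML record with Python's standard `xml.etree.ElementTree` module. The namespace URI is the one used for the Dublin Core elements, version 1.1; the wrapper element name `metadata`, the helper `build_dc_record`, and the field values are assumptions made for this example, and a real repository may require a different wrapper or schema.

```python
import xml.etree.ElementTree as ET

# Namespace for the Dublin Core elements, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
# Ask ElementTree to serialize this namespace with the conventional "dc" prefix.
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Return an XML string with one Dublin Core element per field.

    The "metadata" wrapper element is an assumption for illustration only.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        # "{namespace}name" is ElementTree's notation for a qualified tag.
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical field values; title, creator, and date are Dublin Core elements.
record = build_dc_record(
    {
        "title": "Lake temperature survey",
        "creator": "Example Lab",
        "date": "2020-06-01",
    }
)

if __name__ == "__main__":
    print(record)
```

Each key in the dictionary becomes a `dc:`-prefixed tag, which is what lets other tools recognize the fields as Dublin Core elements rather than ad hoc labels.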