CMU LibGuides: Data Management for Research: Metadata

Metadata

Metadata are data that provide descriptive information (content, context, quality, structure, and accessibility) about a data product and enable others to find and use it. In a lab setting, much of the content used to describe data is initially collected in a notebook; metadata are a more formal, shareable expression of this information. They can include contact information, geographic locations, details about units of measure, abbreviations or codes used in the dataset, instrument and protocol information, survey tool details, provenance and version information, and much more. Where no appropriate formal metadata standard exists, writing “readme”-style metadata is an appropriate strategy for internal use.
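As an illustration of "readme"-style metadata, the following Python sketch writes the kinds of details listed above as aligned field/value lines. The field names, the sample values, and the helper `build_readme` are assumptions made for this example; they are not part of any formal standard, and a real project would choose fields that suit its own data.

```python
def build_readme(fields):
    """Render field/value pairs as aligned 'Field : value' lines."""
    width = max(len(name) for name in fields)
    lines = [f"{name.ljust(width)} : {value}" for name, value in fields.items()]
    return "\n".join(lines) + "\n"


# All values below are placeholders for illustration, not real data.
readme_fields = {
    "Title": "Stream temperature readings, site A",
    "Contact": "Jane Doe <jdoe@example.edu>",
    "Location": "Site A (latitude/longitude recorded in the data file)",
    "Units": "temp_c = degrees Celsius; depth_m = meters",
    "Codes": "NA = missing value; QF = quality flag (0 good, 1 suspect)",
    "Instrument": "Handheld thermometer, calibrated per lab protocol",
    "Provenance": "Transcribed from lab notebook",
    "Version": "1.0",
}

readme_text = build_readme(readme_fields)
print(readme_text)
```

Saving `readme_text` as a plain-text file next to the data files keeps the description and the data together.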

The Digital Curation Centre provides a catalog of common metadata standards, organized by discipline.

Some specific examples of metadata standards, both general and domain-specific, are:

Dublin Core

Domain-agnostic, basic, and widely used metadata standard

DDI (Data Documentation Initiative)

Common standard for social, behavioral and economic sciences, including survey data

EML (Ecological Metadata Language)

Specific to ecological disciplines

ISO 19115

For describing geospatial information

FGDC-CSDGM (Federal Geographic Data Committee's Content Standard for Digital Geospatial Metadata)

For describing geospatial information

MINSEQE (Minimum Information about a high-throughput SEQuencing Experiment)

Genomics standard

FITS (Flexible Image Transport System)

Astronomy digital file standard that includes structured, embedded metadata

Metadata formats

A text or HTML document: You can use a text document to create a data dictionary. A data dictionary records information about metadata elements, sub-elements, and attributes, and provides sample content. It is a good way to record which metadata standard you are using and any variations from that standard. See the list of metadata standards on this page.
An XML document either linked to the data files or embedded within them: If you are using XML, your metadata probably already follow an established metadata standard. For example, XML tags such as <dc:title> correspond to a set of defined Dublin Core elements. Dublin Core is one of the most common metadata standards and may meet most of your metadata needs. Read more about the Dublin Core Metadata Element Set, Version 1.1.
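To show how tags such as <dc:title> map onto Dublin Core elements, here is a minimal Python sketch that builds a small XML record with the standard library. The namespace URI is the one defined for the Dublin Core elements, version 1.1; the wrapper element name `metadata`, the helper `dublin_core_record`, and all sample values are assumptions for illustration only. Repositories usually prescribe their own wrapper element and required fields.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, Version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build an XML element holding one dc:<name> child per field."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Placeholder values; the keys are Dublin Core element names.
record = dublin_core_record({
    "title": "Stream temperature readings, site A",
    "creator": "Doe, Jane",
    "date": "2024-01-15",
    "format": "text/csv",
    "rights": "For internal use only",
})

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The resulting text can be saved as a separate .xml file alongside the data files, which matches the "linked" option described above.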

Contact

If you are unsure of the data management policies or practices best suited for your research, or if you have any other questions, please contact the University Libraries Data and Publishing Services team.