Research Data Services @ Carnegie Mellon University Libraries
A comprehensive guide to implementing FAIR principles in mathematical research
The FAIR principles, first published in 2016 by Wilkinson et al., provide a framework for optimizing the reuse of scientific data. FAIR stands for Findable, Accessible, Interoperable, and Reusable.
"The FAIR principles focus specifically on data management and machine-friendly accessibility aspects with emphasis on metadata, rather than broader research transparency."
These principles are designed to support knowledge discovery and innovation both by humans and machines, to increase the value of accumulated digital research objects, and to enable automation in scientific processes.
Mathematics research generates a wide variety of data types that benefit from structured management approaches. The implementation of FAIR principles in mathematics:
As noted in the journal Scientific Data (2024): "The main goal is to make the scientific community more transparent and efficient by making it significantly easier to re-use previous research results. This openness not only facilitates collaboration within the mathematical community but also enhances cross-fertilization with other fields, fostering interdisciplinary approaches that can lead to new discoveries and innovations."
Mathematical research encompasses diverse data types that require specialized management approaches:
Category | Types | Description |
---|---|---|
Symbolic data | Formulae, Theorems, Proofs, Functions | Expressions and notations for abstract reasoning and articulation of mathematical concepts |
Numeric data | Integer sequences, Matrices, Tensors, Finite lattices | Numerical values and organized structures essential for analytical processes, representation, and computational challenges |
Geometric data | Curves, Surfaces, High-dimensional objects, Polytopes | Mathematical and algorithmic depictions of shapes and structures pivotal to geometry, topology, and related disciplines |
Models | Math models, BioModels | Simplified versions of real-world phenomena designed for predictive analysis and hypothesis testing |
Observational data | Simulations, Experiments, Observations | Data gathered from direct observations, experiments, and simulations to explore and verify natural phenomena |
Text data | Research papers, Encyclopedia entries | Scholarly articles and academic books serving as primary sources for mathematical research and discussion |
The first step in data reuse is to find it. Metadata and data should be easy to discover for both humans and machines.
Principle | Description | Mathematical Implementation |
---|---|---|
F1 | (Meta)data are assigned a globally unique and persistent identifier | Assign DOIs to mathematical papers, proofs, datasets, and software |
F2 | Data are described with rich metadata | Include comprehensive information about mathematical objects, theorems, or datasets |
F3 | Metadata clearly and explicitly include the identifier of the data they describe | Ensure metadata records explicitly reference the persistent ID |
F4 | (Meta)data are registered or indexed in a searchable resource | Submit to indexed repositories like arXiv or specialized mathematical databases |
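
As a minimal sketch of F1 through F3, a metadata record can be held as a plain dictionary and checked for a persistent identifier and basic descriptive fields. The field names, the `REQUIRED_FIELDS` tuple, the `missing_findability_fields` helper, and the DOI `10.1234/example` are illustrative assumptions, not a schema prescribed by the FAIR principles or by any repository.

```python
import json

# Hypothetical minimal field set; real repositories define their own schemas.
REQUIRED_FIELDS = ("identifier", "title", "creators", "description", "keywords")


def missing_findability_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Illustrative record; the DOI is a placeholder, not a registered identifier.
record = {
    "identifier": "https://doi.org/10.1234/example",
    "title": "Enumeration data for small finite lattices",
    "creators": ["A. Researcher"],
    "description": "Counts of finite lattices up to isomorphism.",
    "keywords": ["finite lattices", "enumeration"],
}

# An empty list means every assumed field is present and non-empty.
print(missing_findability_fields(record))

# Serializing to JSON keeps the record readable by both humans and machines.
print(json.dumps(record, indent=2))
```

Because the identifier is stored inside the record itself, the metadata still point to the data they describe when the two are stored separately (F3).
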
Once found, data must be retrievable, with authentication and authorization where appropriate.
Principle | Description | Mathematical Implementation |
---|---|---|
A1 | (Meta)data are retrievable by their identifier using a standardized communications protocol | Ensure data is available via standard web protocols (HTTP/HTTPS) |
A1.1 | The protocol is open, free, and universally implementable | Use non-proprietary access methods compatible with mathematical software |
A1.2 | The protocol allows for authentication and authorization where necessary | Implement standard authentication methods for sensitive mathematical data |
A2 | Metadata are accessible, even when the data are no longer available | Preserve metadata about mathematical objects even if implementations change |
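
Principle A1 can be illustrated with a DOI resolved over HTTPS. The sketch below only prepares a request and does not send it. It assumes the public resolver at `https://doi.org/` and the CSL JSON media type, which the major DOI registration agencies offer through content negotiation. The helper name `build_metadata_request` and the DOI `10.1234/example` are hypothetical.

```python
import urllib.request

# Public DOI resolver; it redirects to wherever the object is currently hosted.
DOI_RESOLVER = "https://doi.org/"


def build_metadata_request(doi):
    """Prepare (but do not send) an HTTPS request for a DOI's metadata."""
    return urllib.request.Request(
        DOI_RESOLVER + doi,
        # Ask for machine-readable metadata instead of the landing page.
        headers={"Accept": "application/vnd.citationstyles.csl+json"},
    )


# Placeholder DOI for illustration only; it is not a registered identifier.
request = build_metadata_request("10.1234/example")
print(request.full_url)
print(request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual retrieval, which requires network access and a registered DOI.
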
Data should integrate with other data and be usable in different applications or workflows.
Principle | Description | Mathematical Implementation |
---|---|---|
I1 | (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation | Use standard formats like MathML, LaTeX, or other established notations |
I2 | (Meta)data use vocabularies that follow FAIR principles | Use the Mathematics Subject Classification (MSC) and other controlled vocabularies |
I3 | (Meta)data include qualified references to other (meta)data | Include references to related mathematical objects, theorems, or datasets with standardized identifiers |
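
To make I2 and I3 concrete, the following sketch checks that subject codes look like MSC codes and that each reference to another object states how the two are related. The regular expression tests syntax only and does not confirm that a code exists in the current MSC. The relation names follow the style of the DataCite `relationType` vocabulary, and the helper names and DOIs are hypothetical.

```python
import re

# Syntactic pattern only (an assumption for illustration): two digits, then
# either "-XX" or "-NN", or a capital letter followed by "xx" or two digits.
MSC_PATTERN = re.compile(r"\d{2}(-(XX|\d{2})|[A-Z](xx|\d{2}))")


def looks_like_msc_code(code):
    """Return True if the string has the shape of an MSC code."""
    return MSC_PATTERN.fullmatch(code) is not None


def unqualified_references(references):
    """Return references that lack a relation type or a target identifier."""
    return [
        ref for ref in references
        if not ref.get("relation") or not ref.get("identifier")
    ]


# Placeholder identifiers; neither DOI is registered.
references = [
    {"relation": "IsSupplementTo", "identifier": "https://doi.org/10.1234/paper"},
    {"relation": "", "identifier": "https://doi.org/10.1234/dataset"},
]

print(looks_like_msc_code("05A15"))
print(unqualified_references(references))
```

A bare link tells a machine only that two objects are connected. A qualified reference also records the kind of connection, which is what I3 asks for.
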
The ultimate goal: data should be well-described to enable reuse in different settings.
Principle | Description | Mathematical Implementation |
---|---|---|
R1 | (Meta)data are richly described with a plurality of accurate and relevant attributes | Provide detailed contextual information about the mathematical content |
R1.1 | (Meta)data are released with a clear and accessible data usage license | Choose appropriate open licenses (e.g., CC-BY, CC0) for mathematical content |
R1.2 | (Meta)data are associated with detailed provenance | Document the origin, development history, and derivation of mathematical results |
R1.3 | (Meta)data meet domain-relevant community standards | Adhere to established conventions in the respective mathematical subdiscipline |
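
One way to record the license and provenance information called for by R1.1 and R1.2 is sketched below. The class names `ProvenanceStep` and `ReuseInfo`, their fields, and the sample values are assumptions made for illustration. The license strings are the SPDX identifiers for CC-BY 4.0 and CC0 1.0.

```python
from dataclasses import dataclass, field

# SPDX identifiers for the two licenses named under R1.1; extend as needed.
OPEN_LICENSES = {"CC-BY-4.0", "CC0-1.0"}


@dataclass
class ProvenanceStep:
    activity: str  # what was done, e.g. "computed" or "formally verified"
    tool: str      # software or method used (free text in this sketch)
    date: str      # ISO 8601 date string


@dataclass
class ReuseInfo:
    license_id: str = ""
    provenance: list = field(default_factory=list)

    def reuse_gaps(self):
        """List the reusability sub-principles this record does not yet meet."""
        gaps = []
        if self.license_id not in OPEN_LICENSES:
            gaps.append("R1.1: no recognized open license")
        if not self.provenance:
            gaps.append("R1.2: no provenance recorded")
        return gaps


# Hypothetical example values.
info = ReuseInfo(
    license_id="CC-BY-4.0",
    provenance=[ProvenanceStep("computed", "custom enumeration script", "2025-01-15")],
)
print(info.reuse_gaps())
print(ReuseInfo().reuse_gaps())
```

R1.3 is left out of the check on purpose, since community standards differ between mathematical subdisciplines and cannot be reduced to a generic rule.
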
The implementation of FAIR principles in mathematics faces several discipline-specific challenges:
These challenges are reflected in the current state of mathematical data repositories, with most systems fulfilling the FAIR principles for findability and accessibility but showing lower compliance rates for interoperability and reusability.
Repository | Focus | Features | URL |
---|---|---|---|
arXiv | Preprints in mathematics and related fields | DOIs, long-term preservation, high visibility | arxiv.org |
OEIS | Integer sequences | Standardized format, comprehensive metadata | oeis.org |
FindStat | Combinatorial statistics | Unique identifiers, integration with computational tools | findstat.org |
Archive of Formal Proofs | Formally verified proofs | Peer-reviewed, machine-checkable proofs | isa-afp.org |
Zenodo | Multidisciplinary (incl. mathematics) | DOIs, versioning, integration with GitHub | zenodo.org |
SuiteSparse Matrix Collection | Sparse matrices | Standardized formats, rich metadata | sparse.tamu.edu |
BioModels | Mathematical models in biology | Unique identifiers, standardized formats | ebi.ac.uk/biomodels |
Harvard Dataverse | Multidisciplinary data repository | DOIs, versioning, rich metadata capabilities | dataverse.harvard.edu |
Use this checklist to assess and improve the FAIR compliance of your mathematical research data:
Developed by the Research & Data Services Team @ CMU Libraries
Last Updated: May 15, 2025