FAIR Principles in Neuroscience and Psychology
This guide gives an overview of the FAIR principles (Findable, Accessible, Interoperable, and Reusable) as they apply to neuroscience and psychology data management. As neuroscience data become increasingly complex and multimodal, FAIR practices have become essential for scientific progress, reproducibility, and effective collaboration. The guide outlines the key components of FAIR implementation for each stakeholder group and provides practical guidance for researchers, laboratories, and institutions.
1. Introduction to FAIR in Neuroscience
The transformation of neuroscience from a closed to an open science has accelerated in recent years. The FAIR data principles, published in 2016, provide a framework that sets minimum requirements for scientific data to be useful: data should be Findable, Accessible, Interoperable, and Reusable.
Neuroscience presents unique challenges for FAIR implementation due to:
- Diversity of data types (imaging, electrophysiology, behavioral, genetic, etc.)
- Multiple scales of investigation (molecular to systems level)
- Variety of model systems and experimental paradigms
- Complexity of data acquisition and analysis workflows
- Large volumes of data generated by modern techniques
Despite early resistance to data sharing in neuroscience compared to fields like genomics and structural biology, the field has made significant progress in the past decade. This has been driven by:
- Large prospective data sharing initiatives (ADNI, Allen Brain Atlas, Human Connectome Project)
- National and international brain projects with mandated open data policies
- Increasing concern about reproducibility in neuroscience research
- Funder requirements and journal policies mandating data sharing
The International Neuroinformatics Coordinating Facility (INCF) has played a crucial role in promoting FAIR practices in neuroscience, providing training, standards development, and infrastructure coordination.
2. The FAIR Principles Defined
The FAIR principles were formulated during a workshop in Leiden in 2014 and subsequently published by Wilkinson et al. (2016). Below is an explanation of each principle as it applies to neuroscience data:
Findable (F)
Data should be easily discoverable by both humans and machines:
- F1: Data are assigned globally unique and persistent identifiers (e.g., DOIs)
- F2: Data are described with rich metadata
- F3: Metadata clearly include the identifier of the data they describe
- F4: Data are registered or indexed in a searchable resource
Accessible (A)
Once found, data should be retrievable:
- A1: Data are retrievable by their identifier using a standardized protocol
- A1.1: Protocol is open, free, and universally implementable
- A1.2: Protocol allows for authentication and authorization when necessary
- A2: Metadata remain accessible even when data are no longer available
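As an illustration of A1 and A1.1, a DOI can be resolved over plain HTTPS, and the doi.org resolver supports content negotiation for machine-readable metadata. The sketch below only builds the request and does not send it. The DOI is that of Wilkinson et al. (2016), and the helper name build_metadata_request is hypothetical. Whether a given DOI returns CSL JSON depends on its registration agency, so treat this as a starting point rather than a guaranteed interface.

```python
import urllib.request

DOI_RESOLVER = "https://doi.org/"
# Media type commonly used to request citation metadata as JSON.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_metadata_request(doi: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP request for machine-readable DOI metadata."""
    return urllib.request.Request(DOI_RESOLVER + doi, headers={"Accept": CSL_JSON})


request = build_metadata_request("10.1038/sdata.2016.18")
print(request.full_url)
print(request.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the actual retrieval; that step is left out here so the example runs without network access.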
Interoperable (I)
Data should be able to be combined with other data and to work with applications and workflows for analysis:
- I1: Data use a formal, accessible, shared, and broadly applicable language for knowledge representation
- I2: Data use vocabularies that follow FAIR principles
- I3: Data include qualified references to other data
Reusable (R)
Data should be well-described to enable reuse:
- R1: Data have a plurality of accurate and relevant attributes
- R1.1: Data are released with a clear and accessible data usage license
- R1.2: Data are associated with detailed provenance
- R1.3: Data meet domain-relevant community standards
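The principles above can be turned into a rough self-check on a dataset's metadata record. The following is a minimal sketch, not an official FAIR assessment: the field names, the mapping from fields to principles, and the example record (including its placeholder identifier) are all assumptions made for illustration.

```python
# Hypothetical mapping from metadata fields to the principle each one supports.
REQUIRED_FIELDS = {
    "identifier": "F1: globally unique, persistent identifier",
    "description": "F2: rich descriptive metadata",
    "access_protocol": "A1: standardized retrieval protocol",
    "vocabulary": "I2: FAIR vocabulary or ontology",
    "license": "R1.1: clear data usage license",
    "provenance": "R1.2: detailed provenance",
    "standard": "R1.3: domain-relevant community standard",
}


def missing_fair_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Made-up record; the identifier is a placeholder, not a real DOI.
example_record = {
    "identifier": "doi:10.0000/example",
    "description": "Resting-state fMRI, 20 adult participants",
    "access_protocol": "https",
    "vocabulary": "UBERON",
    "license": "",
    "standard": "BIDS",
}

for name in missing_fair_fields(example_record):
    print(f"Missing {name} ({REQUIRED_FIELDS[name]})")
```

A check like this only confirms that fields are present; it says nothing about whether their content is accurate or sufficiently detailed.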
3. The FAIR Partnership: Key Stakeholders
FAIR implementation in neuroscience requires collaboration among multiple stakeholders:
Key Stakeholders and Their Roles
Laboratories
- Implement FAIR data management practices
- Prepare datasets with sufficient documentation
- Utilize community standards and vocabularies
- Submit data to appropriate repositories
Repositories
- Issue and maintain persistent identifiers
- Link identifiers to rich metadata
- Provide access controls and authentication
- Support annotation with FAIR vocabularies
- Enforce community standards
- Support data versioning
- Provide clear licensing for datasets
- Ensure long-term data availability
Search Engines
- Index neuroscience data resources
- Facilitate discovery across distributed repositories
- Provide effective search functionality
Community Organizations
- Convene working groups
- Coordinate, review, endorse, and organize standards
- Provide training and resources
- Examples include INCF and the International Brain Initiative (IBI)
4. FAIR Data Management in Laboratories
Effective FAIR implementation begins at the laboratory level. Below are key laboratory data management practices based on the FAIR principles:
FAIR Goal | Principle | Laboratory Practices | References
Findable | Unique identifiers | Create globally unique identifiers within the lab for all key entities (subjects, experiments, reagents). Implement a central registry or use existing systems (e.g., RRIDs for reagents and tools). | Fouad et al. (2023)
 | Rich metadata | Accompany each identifier with detailed metadata (e.g., dates, experimenter, description for experiments; species/strain, age, weight for subjects). Use identifiers consistently in file names, folder names, and physical labels. | Fouad et al. (2023)
Accessible | Authentication and authorization | Create a centralized, accessible store for data and code under a lab-wide account to prevent data from being scattered or accessible only via personal accounts. |
Interoperable | FAIR vocabularies | Replace idiosyncratic naming with community standards like Common Data Elements and community-based ontologies. Create a lab-wide data dictionary where all variables are clearly defined. | Bush et al. (2022), Fouad et al. (2023)
Reusable | Documentation | Create a README file for each dataset with notes and information for reuse. |
 | Community standards | Store files in well-supported open formats. Adopt community standards within the lab, especially those required by target repositories. | Bush et al. (2022), INCF Standards Portfolio
 | Provenance | Version datasets clearly and document differences. Keep a stable "version of record." Include detailed experimental protocols and computational workflows. Use dedicated tools like protocols.io, NeuroShapes, and ReproNim. | Kennedy et al. (2019)
 | Licenses | Ensure data sharing agreements are in place with all collaborators. For clinical datasets, verify that consents permit sharing of de-identified data. |
As noted in the NASEM workshop on "Changing the Culture on Data Management and Sharing" (Martone & Nakamura, 2022): "If you can share data with people in your lab, you are much more likely to have something worthwhile to share outside the lab."
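The identifier and metadata practices above could be prototyped with a very small in-memory registry. This is a sketch under stated assumptions: the LabRegistry class, the prefixes, the metadata values, and the file-naming pattern are all hypothetical. The sequential identifiers are unique only within one lab; global uniqueness would need a lab-specific namespace or UUIDs.

```python
from dataclasses import dataclass, field


@dataclass
class LabRegistry:
    """Hypothetical lab-wide registry that issues identifiers and stores metadata."""

    prefix: str
    records: dict = field(default_factory=dict)

    def register(self, **metadata) -> str:
        # Sequential, zero-padded identifiers such as SUBJ-0001.
        identifier = f"{self.prefix}-{len(self.records) + 1:04d}"
        self.records[identifier] = metadata
        return identifier


subjects = LabRegistry("SUBJ")
experiments = LabRegistry("EXP")

# Illustrative metadata only.
mouse = subjects.register(species="Mus musculus", strain="C57BL/6J", age_weeks=12)
session = experiments.register(
    date="2024-01-15", experimenter="A. Researcher", description="Open field test"
)

# Reuse the same identifiers in file names so files trace back to the registry.
filename = f"{mouse}_{session}_behavior.csv"
print(filename)
```

In practice the registry would live in a shared, backed-up store (a database or a version-controlled table) rather than in memory.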
5. Choosing a Repository
Selecting an appropriate repository is critical for ensuring long-term FAIR compliance. The neuroscience repository landscape is organized primarily by data type, with additional specialization by domain or region.
Types of Neuroscience Repositories
Repositories generally fall into these categories:
- Specialized by data type (e.g., neuroimaging, electrophysiology)
- Domain-specific (e.g., SPARC for peripheral nervous system)
- Region or institution-specific (e.g., CONP, BrainCode, Donders Repository)
- Generalist repositories that span disciplines (e.g., Figshare, Zenodo)
KiltHub, CMU's institutional repository, is an instance of Figshare and aligns with FAIR principles.
Selection Criteria
When selecting a repository, consider:
- Alignment with data type and domain
- Tool support and integration
- Curation services
- Support for data citation
- License options
- Data size limitations
- Help with data management plans
- Cost structure
- Long-term sustainability plan
Repository finder tools include:
- INCF Infrastructure Catalog
- NITRC for neuroimaging repositories
- re3data catalog
- NLM listing of repositories
- NIF listing of BRAIN Initiative Repositories
Repository FAIR Implementation
The table below compares FAIR features across major neuroscience repositories:
Principle | Function | EBRAINS | SPARC | DANDI | CONP Portal | OpenNeuro
F1: Globally unique identifier | Basic core | DOI | DOI | DOI | ARK, DOI | DOI
F2: Rich metadata |  | Y | DataCite | Y | DATS | Y
A1: Retrievable by identifier |  | Y | Y | Y | Y | Y
A1.1: Free, open, universal retrieval protocol | Enhanced access | Y | Y | Y | Y | Y
F4: Registered in a searchable resource |  | KS, GDS | KS, GDS | KS, GDS | KS | KS, GDS
A1.2: Authentication and authorization |  | Y | Y | Y | Y | Y
R1.1: Clear data usage license |  | Y | CC-BY | CC-BY, CC0 | Y | CC0
R1.3: Community standards | Use of standards | Multiple | SDS, MIS | NWB, BIDS | Y* | BIDS
F3: Metadata contains identifier |  | Y | Y | Y | Y | Y
I1: Formal knowledge representation language |  | Y | Y | N | Y |
R1: Plurality of relevant attributes | Rich(er) metadata | OpenMinds | OpenMinds, MIS | NWB | DATS | Y
I2: FAIR vocabularies |  | Y | Y | Y | Y | N
I3: Qualified references to other metadata |  | Y | Y | Y | Y | Y
R1.2: Provenance | Provenance and context |  | Exp Protocol |  | Y | N
Data citation | Additional features | Y | Y | Y | Y | Y
Curation |  | Y | Y | N | Y | N
Y: yes; N: no; KS: INCF Knowledge Space; GDS: Google Dataset Search; DOI: Digital Object Identifier; ARK: Archival Resource Key; SDS: SPARC Data Structure; NWB: Neurodata Without Borders; BIDS: Brain Imaging Data Structure; DATS: Data Tag Suite
6. Standards in Neuroscience
Standards are critical for ensuring interoperability and reusability of neuroscience data. Major standards include:
Data Format Standards
- Brain Imaging Data Structure (BIDS): Standard for organizing and describing neuroimaging data
- NeuroData Without Borders (NWB): Standard for neurophysiology data
- 3D-MMS: Metadata standard for 3D microscopy
- SPARC Data Structure (SDS): For peripheral nervous system data
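To make "organizing and describing" concrete, BIDS encodes key metadata directly in folder and file names as key-value entities such as sub-01 and task-rest. The sketch below builds the path for one functional MRI file. It is a simplified illustration that covers only the sub, ses, and task entities; the function name is hypothetical, and real datasets should be checked with the official BIDS validator.

```python
import re
from pathlib import PurePosixPath

# Simplification: entity labels are restricted to letters and digits.
LABEL = re.compile(r"[A-Za-z0-9]+")


def bids_func_path(subject, task, session=None, suffix="bold", extension=".nii.gz"):
    """Build a BIDS-style relative path for one functional MRI file."""
    entities = [("sub", subject)]
    if session is not None:
        entities.append(("ses", session))
    entities.append(("task", task))

    for key, label in entities:
        if not LABEL.fullmatch(label):
            raise ValueError(f"Label for '{key}' must be alphanumeric: {label!r}")

    stem = "_".join(f"{key}-{label}" for key, label in entities)
    folder = PurePosixPath(f"sub-{subject}")
    if session is not None:
        folder = folder / f"ses-{session}"
    return folder / "func" / f"{stem}_{suffix}{extension}"


print(bids_func_path("01", "rest", session="01"))
```

Generating names from one function, rather than typing them by hand, keeps the naming consistent across a dataset.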
Common Coordinate Frameworks (CCFs)
- Waxholm Space: Registration standard for mouse and rat brain data
- Allen Institute CCF v3: Widely used for mouse brain data across international projects
Vocabularies and Ontologies
- UBERON: Cross-species anatomy ontology
- Brain Data Standards Ontology: Model for data-driven definitions of taxonomic classes
- Cell Cards: Tool for exploring BICCN taxonomic cell types
Standards Adoption
Repositories play a key role in standards adoption by:
- Supporting or requiring standards for data submission
- Providing tools to facilitate standards compliance
- Contributing to standards development
The INCF has implemented an open community review and endorsement process for neuroscience standards and maintains a searchable Standards and Best Practices Portfolio.
7. Neuroscience Infrastructure for FAIR
Neuroscience is served by a globally distributed ecosystem of data repositories and computational platforms that work together to facilitate FAIR data practices.
Search and Discovery Infrastructure
- INCF Knowledge Space: Searches across 16 major neuroscience databases
- Google Dataset Search: Many repositories now mark up content with schema.org for discovery
- Knowledge Graphs: Used by EBRAINS, CONP, and SPARC to link neuroscience concepts
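Schema.org markup for dataset discovery is typically a small JSON-LD document of type Dataset embedded in a dataset's landing page. The sketch below generates one. The property names (name, description, identifier, license, keywords) are standard schema.org Dataset properties, but the function name and all values, including the placeholder identifier, are made up for illustration.

```python
import json


def dataset_jsonld(name, description, identifier, license_url, keywords):
    """Return a minimal schema.org Dataset description as a dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,
        "license": license_url,
        "keywords": list(keywords),
    }


# Hypothetical dataset; the identifier is a placeholder, not a real DOI.
markup = json.dumps(
    dataset_jsonld(
        name="Example resting-state fMRI dataset",
        description="Resting-state scans from 20 adult participants.",
        identifier="https://doi.org/10.0000/example",
        license_url="https://creativecommons.org/publicdomain/zero/1.0/",
        keywords=["neuroimaging", "fMRI", "BIDS"],
    ),
    indent=2,
)
print(markup)
```

Most repositories generate this markup automatically, so researchers usually only need to supply complete metadata at submission.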
Data Federation Approaches
- Canadian Open Neuroscience Portal: Allows search across multiple repositories with unified metadata (DATS standard)
- Data access tools: DataLad for download, Boutiques for containerized workflows
Sustainability Strategies
- Repository exit plans: Clear persistence policies
- Institutional partnerships: Repositories partnering with university libraries
- Infrastructure reuse: Sharing infrastructure components across projects (e.g., Brainlife and NEMAR using OpenNeuro)
8. IRB Considerations and Ethical Aspects
Implementing FAIR principles in neuroscience and psychology must be balanced with ethical considerations, particularly for human subjects research. This requires careful navigation of Institutional Review Board (IRB) requirements and data privacy regulations.
Contact CMU's Institutional Review Board if your research involves human subject data.
IRB and Consent Considerations
Prospective Planning for Data Sharing
- Informed Consent: Design consent forms that explicitly address data sharing intentions
- Tiered Consent Options: Consider providing options for different levels of data sharing (e.g., within institution, with specific collaborators, or public repositories)
- Future Use Language: Include language about potential future research uses of the data
- Re-contact Options: When appropriate, include provisions for re-contacting subjects for additional permissions
Retrospective Data Sharing
- Waiver of Consent: Understand when IRBs might grant waivers for sharing previously collected data
- Data Use Agreements: Utilize institutional agreements to enable sharing of sensitive data
- Secondary Use Determinations: Consult with IRB about requirements for secondary use of existing datasets
De-identification Standards and Practices
Proper de-identification is critical for sharing neuroscience data while protecting subject privacy:
Data Type | De-identification Considerations | Best Practices
Neuroimaging Data | Facial features may allow re-identification | Use established defacing algorithms (e.g., pydeface, mri_deface); verify results visually
Genetic Data | Potentially identifiable even when "anonymized" | Consider controlled access models; use data use agreements
Demographics/Phenotypic | Unique combinations of variables can lead to re-identification | Remove or bin variables that could allow re-identification; keep only necessary variables
Video/Audio Recordings | Direct identifiers | Blur faces, alter voices; consider secure processing with extracted features only
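For demographic and phenotypic variables, one simple way to see the effect of binning is to count how many participants share each combination of values. The sketch below flags combinations shared by fewer than k participants, before and after binning age into decades. The participant rows are invented, the function names are hypothetical, and the threshold k=2 is an arbitrary choice for the example; an appropriate threshold depends on the dataset and on institutional guidance.

```python
from collections import Counter


def bin_age(age, width=10):
    """Replace an exact age with a range such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def risky_combinations(rows, keys, k=2):
    """Return value combinations shared by fewer than k participants."""
    counts = Counter(tuple(row[key] for key in keys) for row in rows)
    return sorted(combo for combo, n in counts.items() if n < k)


# Invented example data.
participants = [
    {"age": 23, "sex": "F"},
    {"age": 27, "sex": "F"},
    {"age": 25, "sex": "M"},
    {"age": 64, "sex": "M"},
]
binned = [{**p, "age": bin_age(p["age"])} for p in participants]

print(len(risky_combinations(participants, ("age", "sex"))))  # every exact age is unique
print(risky_combinations(binned, ("age", "sex")))
```

Binning reduces the number of unique combinations here but does not eliminate them, which is why small or unusual cohorts may still need controlled access.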
Special Populations Considerations
Extra protections are required for vulnerable populations:
- Children: Additional parental/guardian consent requirements
- Clinical Populations: Sensitivity to stigmatization concerns
- Small or Unique Cohorts: Higher risk of re-identification requiring additional safeguards
- Indigenous Communities: Respect for data sovereignty and community-level permissions
Regulatory Frameworks
Various regulations affect FAIR implementation across different jurisdictions:
- GDPR (European Union): Emphasizes data minimization and purpose limitation principles
- HIPAA (United States): Establishes standards for Protected Health Information (PHI)
- 21st Century Cures Act (United States): Promotes data sharing while maintaining privacy
- National/Regional Regulations: May impose additional requirements or limitations
Balancing FAIR with Privacy
Strategies to maximize both FAIR principles and privacy protection:
- Controlled Access Models: Apply different access levels based on data sensitivity
- Safe Havens/Secure Environments: Allow analysis without direct data access
- Privacy-Preserving Analysis Methods: Consider federated learning approaches
- Metadata-Only Sharing: Share rich metadata while restricting access to raw data
- Synthetic Data Generation: Create artificial datasets that preserve statistical properties
IRB Engagement Strategies
Proactive approaches for working with IRBs on FAIR data practices:
- Education: Provide IRB members with information about FAIR principles and their benefits
- Template Language: Develop institutionally approved language for consent forms
- Data Management Plans: Share comprehensive plans addressing both FAIR and privacy
- Staged Approach: Propose gradual implementation with appropriate safeguards
- Collaborative Development: Involve IRB members in developing institutional policies
International Collaboration Considerations
For multinational projects, additional considerations apply:
- Harmonization of Requirements: Address different regulatory frameworks
- Data Localization Laws: Be aware of restrictions on cross-border data transfers
- Regional Standards: Adapt approaches to meet local ethical requirements
9. Implementation Challenges and Solutions
Despite progress, several challenges remain for achieving fully FAIR neuroscience:
Findability Challenges
- Problem: No comprehensive search system across all neuroscience data repositories
- Solution: Support efforts by IBI, INCF, and CONP to create federated search systems
Accessibility Challenges
- Problem: Multimodal datasets require submission to multiple repositories
- Solution: Develop unified workflows and centralized registries; implement ORCID across repositories
Interoperability Challenges
- Problem: Different repository interfaces and access policies
- Solution: Adopt "coopetition" model where repositories compete on innovation but share core functionality
Reusability Challenges
- Problem: Variable data quality and documentation across repositories
- Solution: Support curation services and improve customer service to help researchers comply with standards
International Data Governance
- Problem: Barriers to transferring data across national borders
- Solution: Federated approaches where data remains in place while computation is brought to the data
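The idea of bringing computation to the data can be illustrated with a pooled mean: each site computes a small summary locally, and only the summaries leave the site. This is a toy sketch with invented numbers and hypothetical function names. Real federated systems add secure transport, access control, and checks that the summaries themselves are not disclosive (for example, when a site has very few participants).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate statistics that a site is willing to share."""

    n: int
    total: float


def local_summary(values):
    """Runs at each site; raw values never leave this function."""
    return SiteSummary(n=len(values), total=float(sum(values)))


def pooled_mean(summaries):
    """Runs at the coordinating centre, using only the shared summaries."""
    n = sum(s.n for s in summaries)
    if n == 0:
        raise ValueError("No observations across sites")
    return sum(s.total for s in summaries) / n


# Invented measurements held at two sites.
site_a = [2.0, 4.0]
site_b = [6.0, 8.0, 10.0]

pooled = pooled_mean([local_summary(site_a), local_summary(site_b)])
print(pooled)
```

The result equals the mean of all five values combined, even though neither site shared individual measurements.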
10. Resources and Tools
INCF Resources
Standards Resources
Tools for FAIR Implementation
- SODA Tool: Helps researchers organize and upload files according to SPARC standards
- DataLad: Distributed data management system
- Boutiques: Framework for creating and using containerized applications
- Protocols.io: Platform for sharing detailed experimental protocols
11. FAIR Implementation Checklist
This checklist provides a step-by-step guide to implementing FAIR principles throughout the research lifecycle in neuroscience and psychology.
✓ Planning Phase
- Develop a comprehensive data management plan
- Identify community standards relevant to your data types
- Plan for appropriate metadata collection at all stages
- Design informed consent for data sharing (human subjects)
- Establish file naming conventions and folder structure
- Select appropriate identifier schemes
- Plan quality control procedures
- Identify potential repositories for data sharing
- Budget time and resources for FAIR implementation
✓ Data Collection Phase
- Document all experimental protocols in detail
- Record complete metadata for subjects/samples
- Maintain detailed logs of acquisition parameters
- Track data provenance carefully
- Apply consistent file naming conventions
- Implement quality control procedures
- Validate data against applicable standards
- Update documentation as protocols evolve
✓ Data Processing Phase
- Document all preprocessing steps
- Record software versions and parameters used
- Preserve raw data alongside processed data
- Use standard data formats where available
- Validate processed data against standards
- Create machine-readable workflow descriptions
- Maintain version control for analysis code
- Track provenance between raw and derived data
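Several of the processing-phase items (software versions, parameters, and the link between raw and derived data) can be captured in one small machine-readable record per processing step. The sketch below uses only the Python standard library. The record layout and function names are assumptions for illustration, not an established provenance schema, and the byte strings stand in for real files.

```python
import hashlib
import json
import platform


def sha256_of(data: bytes) -> str:
    """Checksum that ties a record to exact file contents."""
    return hashlib.sha256(data).hexdigest()


def provenance_record(raw: bytes, derived: bytes, step: str, parameters: dict) -> dict:
    """Describe one processing step linking raw input to derived output."""
    return {
        "step": step,
        "parameters": parameters,
        "raw_sha256": sha256_of(raw),
        "derived_sha256": sha256_of(derived),
        "python_version": platform.python_version(),
        "platform": platform.platform(),
    }


# Stand-ins for file contents before and after a scaling step.
raw = b"1,2,3\n"
derived = b"2,4,6\n"

record = provenance_record(raw, derived, step="scale", parameters={"factor": 2})
print(json.dumps(record, indent=2))
```

Saving such a record next to each derived file makes it possible to confirm later which raw data and settings produced it.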
✓ Analysis Phase
- Document all analysis decisions and parameters
- Use standardized analysis pipelines when available
- Comment code thoroughly
- Save intermediate analysis outputs
- Create research compendia with code and data
- Generate machine-readable metadata
- Verify reproducibility of key analyses
- Create visualizations to aid in data reuse
✓ Publication and Sharing Phase
- Select appropriate licenses for data and code
- Prepare data for repository submission
- Create comprehensive README files
- Generate rich, standards-compliant metadata
- Verify all data pass repository validation
- Ensure proper de-identification (human data)
- Create data dictionaries for complex datasets
- Obtain persistent identifiers (DOIs, RRIDs)
- Include data/code citations in manuscripts
- Link related research objects (data, code, articles)
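A data dictionary is easiest to keep current when it can be checked automatically against the dataset it describes. The sketch below reports columns that have no dictionary entry. The variable names, descriptions, units, and the function name are invented for the example.

```python
# Hypothetical data dictionary: one entry per variable.
DATA_DICTIONARY = {
    "subject_id": {"description": "Lab-assigned subject identifier", "unit": None},
    "age_weeks": {"description": "Age at testing", "unit": "weeks"},
    "latency_ms": {"description": "Response latency", "unit": "ms"},
}


def undefined_columns(columns, dictionary):
    """Return dataset columns that are not defined in the data dictionary."""
    return [column for column in columns if column not in dictionary]


# Column headers as they might appear in a shared CSV file.
columns = ["subject_id", "age_weeks", "latency_ms", "rt2"]

print(undefined_columns(columns, DATA_DICTIONARY))
```

An undocumented column such as rt2 is exactly the kind of idiosyncratic name that makes a dataset hard to reuse, so catching it before submission is worthwhile.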
✓ Technical FAIR Requirements
- Findable
  - Assign persistent identifiers (DOIs, handles, ARKs)
  - Create rich, machine-readable metadata
  - Register in appropriate search engines
  - Include metadata even for non-public data
- Accessible
  - Store in trusted repositories
  - Ensure metadata remain accessible even if data are restricted
  - Specify clear access conditions and processes
  - Provide contact information for access requests
- Interoperable
  - Use open, community standards for data formats
  - Apply community ontologies and vocabularies
  - Include qualified references to related datasets
  - Document relationships between variables
- Reusable
  - Apply clear, machine-readable licenses
  - Document detailed provenance information
  - Meet domain-relevant community standards
  - Provide sufficient context for interpretation
  - Include citation information
✓ IRB and Ethics Compliance
- Ensure consent covers intended sharing and reuse
- Verify appropriate de-identification procedures
- Determine appropriate access controls
- Comply with all applicable regulations
- Document any sharing restrictions
✓ Long-term Maintenance
- Plan for long-term data preservation
- Update metadata as standards evolve
- Monitor links to related resources
- Track dataset reuse and impact
- Establish procedures for addressing issues
This checklist can be adapted based on specific research protocols, data types, and institutional requirements. Regular review of FAIR practices should be integrated into project workflows to ensure continuous improvement in data management and sharing.
12. References
Abrams, M. B., Bjaalie, J. G., Das, S., Egan, G. F., Ghosh, S. S., Goscinski, W. J., et al. (2021). A standards organization for open and FAIR neuroscience: the International Neuroinformatics Coordinating Facility. Neuroinformatics, 20, 25–36.
Bush, K. A., Calvert, M. L., & Kilts, C. D. (2022). Lessons learned: a neuroimaging research center's transition to open and reproducible science. Frontiers in Big Data, 5, 988084.
Ferguson, A. R., Nielson, J. L., Cragin, M. H., Bandrowski, A. E., & Martone, M. E. (2014). Big data from small data: data-sharing in the 'long tail' of neuroscience. Nature Neuroscience, 17, 1442–1447.
Fouad, K., Vavrek, R., Surles-Zeigler, M. C., Huie, J. R., Radabaugh, H., Gurkoff, G. G., et al. (2023). A practical guide to data management and sharing for biomedical laboratory researchers. Zenodo.
Gorgolewski, K. J., Auer, T., Calhoun, V. D., Craddock, R. C., Das, S., Duff, E. P., et al. (2016). The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Scientific Data, 3, 160044.
Kennedy, D. N., Abraham, S. A., Bates, J. F., Crowley, A., Ghosh, S., Gillespie, T., et al. (2019). Everything matters: the ReproNim perspective on reproducible neuroimaging. Frontiers in Neuroinformatics, 13, 1.
Martone, M. E. (2024). The past, present and future of neuroscience data sharing: a perspective on the state of practices and infrastructure for FAIR. Frontiers in Neuroinformatics, 17, 1276407.
Martone, M. E., & Nakamura, R. (2022). Changing the culture on data management and sharing: overview and highlights from a workshop held by the National Academies of Sciences, Engineering, and Medicine. Harvard Data Science Review.
Poline, J.-B., Kennedy, D. N., Sommer, F. T., Ascoli, G. A., Van Essen, D. C., Ferguson, A. R., et al. (2022). Is neuroscience FAIR? A call for collaborative standardisation of neuroscience data. Neuroinformatics, 20, 507–512.
Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., et al. (2016). The FAIR guiding principles for scientific data management and stewardship. Scientific Data, 3, 160018.