FAIR stands for Findable, Accessible, Interoperable, and Reusable. Originally developed for research data, the principles have since been adapted to computational workflows, with the aim of maximizing their value as research assets and easing their adoption by the wider research community.
Computational workflow: software with two main characteristics:
1. The formal specification of data flow and execution control between executable components, expected datasets, and parameter files.
2. The instantiation of the workflow with inputs (parameters, input datasets) and outputs (output data, provenance execution log, lineage of data products).

Workflow management system (WMS): software that handles data flow and/or execution control, abstracting the workflow from the underlying digital infrastructure (examples: Nextflow, Galaxy, Snakemake, Parsl).
What it means: Your workflow needs a permanent, unique "address" on the internet
How to implement:
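Once a repository such as Zenodo has issued an identifier, it helps to cite it in one consistent, resolvable form. The sketch below normalizes a DOI written in several common ways into a `https://doi.org/` URL. The function name `doi_to_url`, the loose validation pattern, and the example DOI are all assumptions made for illustration; the pattern is a pragmatic check, not a full DOI validator.

```python
import re

# Loose shape check: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes people commonly paste in front of the bare DOI.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(raw: str) -> str:
    """Return the https://doi.org/ resolver URL for a DOI given in any common form."""
    doi = raw.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognizable DOI: {raw!r}")
    return f"https://doi.org/{doi}"


# Hypothetical DOI, used only to show the call.
print(doi_to_url("doi:10.5281/zenodo.0000000"))
```

Storing the bare DOI and rendering the URL on demand keeps citations, README badges, and metadata files in agreement.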
What it means: Each part of your workflow (scripts, tools, sub-workflows) needs its own identifier
How to implement:
What it means: Each version of your workflow gets a unique identifier
How to implement:
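One common way to give each version its own identifier is a version tag on every release. The sketch below assumes semantic versioning (MAJOR.MINOR.PATCH) and Git-style tags such as `v1.4.2`; both are assumptions made here, not requirements, and the names `Version`, `parse_version`, and `bump` are hypothetical helpers.

```python
import re
from typing import NamedTuple

# Accepts "1.4.2" or "v1.4.2"; pre-release suffixes are deliberately not handled.
SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    def tag(self) -> str:
        """Render the version as a release tag, e.g. 'v1.4.2'."""
        return f"v{self.major}.{self.minor}.{self.patch}"


def parse_version(text: str) -> Version:
    """Parse a MAJOR.MINOR.PATCH string, with or without a leading 'v'."""
    match = SEMVER.match(text.strip())
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(part) for part in match.groups()))


def bump(version: Version, part: str) -> Version:
    """Return the next version, resetting the lower-order fields to zero."""
    if part == "major":
        return Version(version.major + 1, 0, 0)
    if part == "minor":
        return Version(version.major, version.minor + 1, 0)
    if part == "patch":
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown part: {part!r}")


current = parse_version("v1.4.2")
print(bump(current, "minor").tag())
```

Because `Version` is a tuple, versions compare in the expected order, so `Version(1, 10, 0) > Version(1, 9, 3)` holds.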
What it means: Comprehensive information about your workflow's purpose, requirements, and usage
How to implement:
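As a rough illustration, descriptive metadata can be kept in a small machine-readable record next to the workflow. The sketch below builds a JSON-LD style record using the schema.org `SoftwareSourceCode` type and refuses to produce one if key fields are missing. The choice of required fields, the function name `build_metadata`, and every example value (name, DOI, repository URL) are assumptions for illustration only.

```python
import json

# Fields this sketch treats as mandatory; adjust to your registry's requirements.
REQUIRED_FIELDS = ("name", "description", "version", "license", "identifier")


def build_metadata(**fields) -> dict:
    """Return a schema.org-style record, or raise if a required field is empty."""
    missing = [key for key in REQUIRED_FIELDS if not fields.get(key)]
    if missing:
        raise ValueError("missing metadata fields: " + ", ".join(missing))
    record = {"@context": "https://schema.org", "@type": "SoftwareSourceCode"}
    record.update(fields)
    return record


# All values below are hypothetical placeholders.
record = build_metadata(
    name="example-variant-calling",
    description="Calls variants from paired-end reads; needs a reference genome.",
    version="1.4.2",
    license="https://spdx.org/licenses/MIT.html",
    identifier="https://doi.org/10.5281/zenodo.0000000",
    programmingLanguage="Snakemake",
    codeRepository="https://github.com/example-org/example-variant-calling",
)

print(json.dumps(record, indent=2))
```

Including the workflow's own identifier in the record also makes explicit which workflow the description refers to.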
What it means: The description clearly states which workflow it describes
How to implement:
What it means: Your workflow can be found through search engines and registries
How to implement:
What it means: Anyone can download your workflow using standard web protocols
How to implement:
What it means: No special software needed to access your workflow
How to implement:
What it means: If access restrictions are needed, use standard authentication
How to implement:
What it means: Description remains available even if workflow can't be run
How to implement:
What it means: Use standard formats that both humans and computers can understand
How to implement:
What it means: Use standardized terms and classifications
How to implement:
What it means: Your workflow uses standard file formats and data structures
How to implement:
What it means: Clear links to external tools, datasets, and related workflows
How to implement:
What it means: Complete documentation enabling others to understand and use your workflow
How to implement:
What it means: Legal terms for using and modifying your workflow are explicit
How to implement:
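A simple automated check can catch a missing license file before a release. The sketch below looks for a license file at the top of a workflow directory; the list of file names is a common convention assumed here, and `find_license_file` is a hypothetical helper, not part of any tool.

```python
from pathlib import Path
from typing import Optional

# Conventional license file names; extend the tuple if your project uses another.
LICENSE_NAMES = ("LICENSE", "LICENSE.txt", "LICENSE.md", "COPYING")


def find_license_file(repo_dir: str) -> Optional[Path]:
    """Return the first conventional license file found in repo_dir, else None."""
    root = Path(repo_dir)
    for name in LICENSE_NAMES:
        candidate = root / name
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_license_file(".")` from the repository root returns the path to the license file, or `None` if the workflow still needs one. The same check can be run on each component directory.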
What it means: All parts of your workflow have explicit licensing
How to implement:
What it means: Clear history of workflow development and data lineage
How to implement:
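Where a workflow management system does not record provenance automatically, a lightweight log can cover the basics. The sketch below records, for each step, a UTC timestamp and SHA-256 checksums of the input and output files, and appends the record as one JSON line. The record layout and the function names are assumptions for illustration, not a provenance standard.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance_record(step: str, inputs: list, outputs: list) -> dict:
    """Describe one executed step: when it finished and what it read and wrote."""
    return {
        "step": step,
        "finished_at": datetime.now(timezone.utc).isoformat(),
        "inputs": {str(p): sha256_of(Path(p)) for p in inputs},
        "outputs": {str(p): sha256_of(Path(p)) for p in outputs},
    }


def append_record(log_path: str, record: dict) -> None:
    """Append the record to a JSON Lines log, one record per line."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
```

Checksums let a later reader confirm that the data they hold is the data a given run actually consumed or produced.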
What it means: Clear connections to related workflows and dependencies
How to implement:
What it means: Follow best practices specific to your research field
How to implement:
Computational workflows exist on a spectrum of complexity, and the implementation of FAIR principles can vary depending on the scale and sophistication of your workflow.
The table below compares typical features of simple and complex workflows:
| Aspect | Simple Workflows | Complex Workflows |
|---|---|---|
| Scale | < 1 GB data, < 10 steps | > 1 GB data, 10+ steps |
| Time Investment | Hours to days | Weeks to months |
| FAIR Complexity | Basic implementation | Full implementation |
| Primary Tools | Git, GitHub, basic docs | WMS, registries, containers |
| Metadata | README files, comments | Structured schemas, ontologies |
| Testing | Manual testing | Automated CI/CD |
| Deployment | Local execution | Multi-platform deployment |
| Maintenance | Occasional updates | Ongoing maintenance |
Solution: Use containerization (Docker, Singularity) to package dependencies
Solution: Use data repositories (Zenodo, domain-specific archives) and reference by DOI
Solution: Use workflow management systems that abstract execution environment
Solution: Pin specific versions and use containers for reproducibility
Solution: Start with minimal documentation and improve iteratively
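For the version-pinning solution above, a small script can capture the exact package versions of a working Python environment. This is a minimal sketch that assumes the workflow's dependencies are Python packages installed in the current environment; `installed_versions` and `pin_requirements` are hypothetical helper names, and the package names passed in are yours to supply.

```python
from importlib import metadata


def installed_versions(names) -> dict:
    """Look up the installed version of each named package, skipping absent ones."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # not installed: skipped rather than guessed
    return found


def pin_requirements(versions: dict) -> list:
    """Render 'name==version' lines in sorted order for a requirements file."""
    return [f"{name}=={version}" for name, version in sorted(versions.items())]


# Deterministic example with made-up versions.
print("\n".join(pin_requirements({"pandas": "2.2.0", "numpy": "1.26.4"})))
```

Writing these lines to a requirements file and installing from it inside a container image fixes both the package versions and the system they run on.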
| FAIR Aspect | Simple Workflow Approach | Complex Workflow Approach |
|---|---|---|
| F1 - Identifiers | GitHub releases, optional Zenodo DOI | Workflow registry DOI, component DOIs |
| F2 - Metadata | README files, inline comments | Structured metadata, ontologies |
| A1 - Access | Direct GitHub download | Container images, registry access |
| I1 - Standards | Standard file formats | Workflow languages (CWL, WDL) |
| R1 - Documentation | README + examples | Comprehensive docs + tutorials |
| R1.3 - Provenance | Git history, manual logs | Automated provenance tracking |
Time investment: 2-4 hours
Immediate benefits: Easier sharing, version control, basic reproducibility
Next steps: Add example data, create GitHub release, consider Zenodo DOI
Start with simple workflow practices, then gradually adopt WMS features:
FAIR is a journey, not a destination. Start with what you can implement today and improve over time!