During the COVID-19 outbreak, many researchers and healthcare professionals are rapidly publishing findings that help explain the mechanism and epidemiology of the disease and offering insights and solutions for the pandemic. Below are a few high-quality collections.
COVID-19 Open Research Dataset (CORD-19): A free, machine-readable resource developed by the Allen Institute for AI. Contains over 44,000 scholarly articles, including over 29,000 with full text, about COVID-19 and the coronavirus family of viruses for use by the global research community. This dataset is intended to mobilize researchers to apply recent advances in natural language processing to generate new insights in support of the fight against this infectious disease. The corpus will be updated weekly as new research is published in peer-reviewed publications and archival services like bioRxiv, medRxiv, and others.
WHO global research and publications database on COVID-19: Latest scientific findings and knowledge on coronavirus disease (COVID-19), together with a searchable WHO database of publications.
LitCovid: A curated literature hub for tracking up-to-date scientific information about the 2019 novel coronavirus. Currently contains a collection of more than 1,200 journal articles hosted by the National Library of Medicine.
Johns Hopkins University COVID-19 data: The data repository for the COVID-19 Dashboard operated by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). It is also supported by the ESRI Living Atlas Team and the Johns Hopkins University Applied Physics Lab (JHU APL). Aggregated from many data sources, including WHO, CDC, WorldoMeters, the New York Times, and many more.
re3data.org (Registry of Research Data Repositories): A global registry of research data repositories from all academic disciplines. It provides an overview of existing research data repositories to help researchers identify a suitable repository for their data.
FAIRsharing.org: A curated, informative, and educational resource on data and metadata standards, databases, policies, and collections. Contains many collections in the biomedical field.
NYU Data Catalog: An open repository maintained by NYU medical school. It includes datasets generated by NYU researchers as well as publicly available and licensed datasets generated by external organizations, e.g. the Bureau of Labor Statistics.
UCI Machine Learning Repository: A collection of databases, domain theories, and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms. It has been widely used by students, educators, and researchers all over the world as a primary source of machine learning data sets.
WordNet: A large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept.
ImageNet: An image database organized according to the WordNet hierarchy (currently only the nouns), in which each node of the hierarchy is depicted by hundreds and thousands of images.
Open Data for Deep Learning: A collection of open datasets maintained by Skymind, a model deployment platform.
StateOfTheArt.ai: An entirely community-driven website for tasks, datasets, metrics, and results.
Papers With Code: The mission of Papers With Code is to create a free and open resource with Machine Learning papers, code and evaluation tables.
NLP-progress: A repository that tracks progress in Natural Language Processing (NLP), including the datasets and the current state of the art for the most common NLP tasks.
Nature Scientific Data has a very good list of recommended subject-specific repositories.
LearnSphere: Integrates existing and new educational data and analysis repositories to offer the world's largest learning analytics infrastructure with methods, linked data, and portal access to relevant resources.
DataShop: A data repository and web application for learning science researchers. It provides secure data storage as well as an array of analysis and visualization tools available through a web-based interface.
United Nations Data Catalog: A comprehensive and representative overview of UN system open data assets.
Data.gov: The US government's open data, tools, and resources.
OpenEI: Maintained by CKAN. Includes industry open data.
WorldData.AI: A searchable digital platform that provides access to 3.3 billion curated datasets across macroeconomics, trade, labour statistics, financial markets, weather, health, and demographics. Free for academics. Here is a short article that teaches you how to use it.
PMC Text Mining Collections: PubMed Central (PMC) maintains several large subsets or collections of articles where files for text mining and other purposes are made available under Creative Commons or similar licenses that generally allow more liberal redistribution and reuse than a traditional copyrighted work.