
Data 101: Finding Data

In this LibGuide, we introduce you to the wide world of data, including data types (qualitative, quantitative, ethnographic, geospatial, etc.), finding data, visualizing data, and managing data.

Image Description: Person using a laptop to interact with a dataset.

When would you reuse data?

There are many reasons why you may want to reuse existing data. Below are some common examples:

  • You need data collected by another agency, such as the U.S. Census Bureau or the United Nations Statistics Division
  • You want to supplement your own collected data with historical data on the same topic
  • You are hoping to replicate the results of a scientific study by re-analyzing its open data
  • You want to blend data from several sources to produce a holistic analysis of a topic

 

We are incredibly lucky to live during a time when the amount of available digital data is skyrocketing! Using existing data for research projects can help save time and money, and supports innovation within the scholarly community. For more information on the benefits of reusing data, please visit this resource: 

https://okfn.org/opendata/why-open-data/

Citing Data

When you use data that was gathered by another researcher or organization, it is important to cite the source so that the creators receive proper credit. In general, your citation should include information on:

  • Creator of the data
  • Name of the dataset
  • Year of data publication
  • Where the data is housed
  • Version (is the dataset a numbered version?)
  • Applicable access information, such as a DOI or URL

 

DataCite offers a recommended format for data citation at the following website: https://datacite.org/cite-your-data.html
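
As a rough illustration of how the elements listed above fit together, here is a minimal Python sketch that assembles them into a single citation string. The ordering (creator, year, title, version, repository, identifier) is one common pattern and is an assumption here; check the DataCite page above or your style guide for the exact format you need. The class name, field names, and all sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetCitation:
    """Holds the citation elements listed above (names are illustrative)."""
    creator: str                   # creator of the data
    year: int                      # year of data publication
    title: str                     # name of the dataset
    repository: str                # where the data is housed
    identifier: str                # DOI or URL
    version: Optional[str] = None  # only if the dataset is versioned

    def to_text(self) -> str:
        # Assumed ordering: Creator (Year). Title. Version. Repository. Identifier
        parts = [f"{self.creator} ({self.year})", self.title]
        if self.version:
            parts.append(f"Version {self.version}")
        parts.append(self.repository)
        return ". ".join(parts) + ". " + self.identifier


# Hypothetical dataset, for illustration only.
example = DatasetCitation(
    creator="Doe, J.",
    year=2020,
    title="Example Survey Responses",
    repository="Example Repository",
    identifier="https://doi.org/10.0000/example",
    version="2",
)
print(example.to_text())
```

Filling in the fields for each dataset you use, and leaving out the version when there is none, keeps your data citations consistent across a project.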

Where can you search for data?

Open data can be found in a variety of places, and knowing where to look can feel a bit daunting! Below, we've included some links to other LibGuides and external resources to help you find the data you need: 

CMU LibGuides 

Business and Economic Datasets: https://guides.library.cmu.edu/datasets 

Finding & Using Social Science Data from ICPSR: https://guides.library.cmu.edu/ICPSR 

Find Data for Text Mining: https://guides.library.cmu.edu/c.php?g=827727&p=5916796 

Open Data Repositories 

figshare: http://figshare.com

Zenodo: https://zenodo.org

re3data, a registry of research data repositories: https://www.re3data.org

Search Engines for Data

DataONE Search: a platform for finding open data on environmental and earth science topics from across the world, with descriptive metadata for each available dataset. 

What should you think about when finding data?

When looking for data, ask yourself the following questions: 

  1. Is the source providing the data considered reputable?
  2. Does the data come with metadata that explains how it was collected?
  3. Was the data ethically collected?
  4. Does the data have any restrictions on reuse?

 

Determining whether you can reuse data in your research can be a tricky process. Luckily, CMU Libraries can help you! Feel free to reach out to our Research Data Management Consultant, Hannah Gunderman, at hgunderm@andrew.cmu.edu.

Local Experts at CMU Libraries

Did you know that CMU Libraries has experts in each academic area who can help you find data for your research needs?

Check out our "Find a Subject Expert" page for a librarian expert in your area!

Psychology and Social & Decision Sciences Liaison Librarian

Data Collaboration Lab

The CMU Libraries Data Collaboration Lab (DataCoLAB) connects the research community across disciplinary borders, and facilitates collaborations between data producers and data scientists. The program connects researchers who want more from their datasets with individuals who have data and computer science skills, creating opportunities for people with different technical and disciplinary backgrounds to work together.

Want to learn more or ask questions? Email dataCoLAB@andrew.cmu.edu.

Credits and Acknowledgements

Banner image courtesy of Pexels, under free use licensing (https://www.pexels.com/photo/person-s-hand-on-top-of-laptop-while-working-938963/). Design made in Canva.