CMU Libraries houses these datasets on the Redivis platform. Follow these steps to gain access to the Data Axle datasets:
1. Sign up for a Redivis account using your CMU email address.
2. Once your account is created, select "Apply for access" on the Data Axle datasets record and complete the request form.
3. An administrator will review and approve or deny your access to the datasets (you will be notified via email either way).
4. If approved, log back in to Redivis and return to the Data Axle datasets record to access the data.
If you have any questions about access, please contact Ryan Splenda.
NOTE: Use of these datasets is restricted to current CMU faculty, staff, and students.
Data Axle (formerly Infogroup) is a data analytics marketing firm that provides marketing data on millions of businesses and consumers.
CMU Libraries houses and facilitates access to two forms of Data Axle historical data:
1. Historic Business Data - address-level data on US businesses and other organizations. Data fields include: street address, phone number, number of employees, SIC codes, and more.
2. Residential Historical Data - geo-referenced data on millions of households and basic consumer profiles. Data fields include: household income, home value, years in residence, and more.
The files are organized in state-year format, with each file providing a snapshot of households/businesses at the end of each calendar year. The data can be represented as a time series from 1996-2023 (for business records) and 2006-2023 (for consumer records).
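To illustrate how state-year snapshots can be combined into an annual time series, here is a minimal Python sketch. The coverage years come from the description above; the (state, year) dictionary layout, the function names, and the sample records are hypothetical and do not reflect the actual file or table names on Redivis.

```python
from collections import Counter

# Coverage years taken from the description above.
COVERAGE = {
    "business": range(1996, 2024),     # 1996-2023 inclusive
    "residential": range(2006, 2024),  # 2006-2023 inclusive
}


def state_year_keys(product, states):
    """Return every (state, year) pair expected for a product."""
    return [(state, year) for state in states for year in COVERAGE[product]]


def records_per_year(snapshots):
    """Collapse state-year snapshots into an annual count of records.

    `snapshots` maps (state, year) -> list of records. This layout is a
    hypothetical stand-in for however the snapshots are actually loaded.
    """
    totals = Counter()
    for (_state, year), records in snapshots.items():
        totals[year] += len(records)
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    # Two states across the full business coverage period.
    print(len(state_year_keys("business", ["PA", "OH"])))

    # Made-up records, only to show the aggregation step.
    demo = {
        ("PA", 2022): ["a", "b"],
        ("OH", 2022): ["c"],
        ("PA", 2023): ["d"],
    }
    print(records_per_year(demo))
```

Because each snapshot describes the end of a calendar year, counting or summarizing by year across states gives one observation per year for the series.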
Timeframe Extent: Time series, annual. 1996-2023 (for Historic Business), 2006-2023 (for Residential Historical).
Geographic Extent: United States of America
Unit of Analysis: Households (for Residential Historical), businesses (for Historic Business)
The Data Axle datasets are very large (~1TB). Analysis can be performed within the Redivis platform, but for the best performance you should work with sub-datasets of 1GB or less. We therefore encourage you to filter or aggregate the Data Axle data within Redivis before doing any analysis. Here are two options to consider after data aggregation:
1. Data analysis in the Redivis platform using Jupyter notebooks.
2. Stream data directly into Python for analysis.
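As a rough sketch of the second option, the Python below builds a SQL query that keeps a single state-year slice and a few columns, then streams the result into a pandas DataFrame. It assumes the `redivis` Python client is installed and authenticated, and that `redivis.query(...).to_pandas_dataframe()` behaves as described in the Redivis client documentation; please verify this against the current documentation. The table reference and the column names (`state`, `year`, `company`, `employee_count`) are hypothetical placeholders, not the real Data Axle schema.

```python
def build_subset_query(table, state, year, columns):
    """Build a SQL query that keeps one state-year slice and a few columns.

    `table` is a placeholder reference; `state` and `year` are assumed
    column names and may differ in the actual Data Axle tables.
    """
    if not columns:
        raise ValueError("select at least one column")
    if not (isinstance(year, int) and 1996 <= year <= 2023):
        raise ValueError("year must fall within 1996-2023")
    if not (len(state) == 2 and state.isalpha()):
        raise ValueError("state must be a two-letter abbreviation")
    column_list = ", ".join(columns)
    return (
        f"SELECT {column_list} FROM `{table}` "
        f"WHERE state = '{state.upper()}' AND year = {year}"
    )


def stream_subset(sql):
    """Run the query on Redivis and return the result as a DataFrame.

    Requires the `redivis` package and an authenticated session.
    """
    import redivis  # imported lazily so the query builder works without it

    return redivis.query(sql).to_pandas_dataframe()


if __name__ == "__main__":
    sql = build_subset_query(
        "owner.dataset.table",  # hypothetical table reference
        "pa",
        2020,
        ["company", "employee_count"],  # hypothetical column names
    )
    print(sql)
    # df = stream_subset(sql)  # uncomment once access has been approved
```

Selecting only the columns and the state-year slice you need is one way to keep the streamed result well under the 1GB guideline.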
Because of the size of these datasets, we encourage interested users to consult with us before starting by contacting both:
Ryan Splenda, Business & Economics Librarian
Kristen Scotti, Open Science Postdoctoral Associate