
Data Axle (Infogroup) Historic Business and Residential Historical Data

Access to Data

CMU Libraries houses these datasets on the Redivis platform. Follow these steps to gain access to the Data Axle datasets:

 

1. Sign up for a Redivis account using your CMU email address.

 

2. Once you have created your Redivis account, select "Apply for access" and fill out the form to request access to these datasets.

 

3. An administrator will review your request and approve or deny access to the datasets. You will be notified by email either way.

 

4. If approved, log back in to Redivis and open the Data Axle datasets record to access the data.

 

If you have any questions about access, please contact Ryan Splenda.

 

NOTE: Use of these datasets is restricted to current CMU faculty, staff, and students.

Documentation

Data Summary

Data Axle (formerly Infogroup) is a data analytics marketing firm that provides marketing data on millions of businesses and consumers. 

 

CMU Libraries houses and facilitates access to two forms of Data Axle historical data:

 

1. Historic Business Data - address-level data on US businesses and other organizations. Data fields include: street address, phone number, number of employees, SIC codes, and more.

 

2. Residential Historical Data - geo-referenced data on millions of households and basic consumer profiles. Data fields include: household income, home value, years in residence, and more.

 

The files are organized in state-year format, with each file providing a snapshot of households/businesses at the end of each calendar year. The data can be represented as a time series from 1996-2023 (for business records) and 2006-2023 (for consumer records).

 

Timeframe Extent: Time series, annual. 1996-2023 (for Historic Business), 2006-2023 (for Residential Historical).

 

Geographic Extent: United States of America

 

Unit of Analysis: Households (for Residential Historical), businesses (for Historic Business)

 

Analyzing Data in Redivis

The Data Axle datasets are large (~1 TB). Analysis can be performed within the Redivis platform, but for it to work well you should use sub-datasets of 1 GB or less. We therefore encourage filtering or aggregating the Data Axle data within Redivis before doing any analysis. Here are two options to consider once the data has been reduced:

 

1. Analyze the data in the Redivis platform using Jupyter notebooks.

 

2. Stream the data directly into Python for analysis.
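
As a rough illustration of the second option, the sketch below builds a query that narrows the data to one state and a range of years, then streams the result into a pandas DataFrame. It is a minimal sketch under stated assumptions, not an official recipe: the column names `state` and `year`, the table name, and the organization and dataset names passed to `stream_subset` are hypothetical placeholders, and the `redivis` calls reflect our understanding of the redivis-python client, so please check them against the Redivis documentation before relying on them.

```python
def build_subset_query(table, state, first_year, last_year, columns=("*",)):
    """Return a SQL string that selects one state and a range of years.

    The column names `state` and `year` are hypothetical; replace them
    with the actual variable names shown on the Redivis table page.
    """
    if not (len(state) == 2 and state.isalpha()):
        raise ValueError("state must be a two-letter abbreviation")
    # Business records span 1996-2023; consumer records begin in 2006.
    if not 1996 <= first_year <= last_year <= 2023:
        raise ValueError("years must fall within 1996-2023")
    selected = ", ".join(columns)
    return (
        f"SELECT {selected} FROM {table} "
        f"WHERE state = '{state.upper()}' "
        f"AND year BETWEEN {first_year} AND {last_year}"
    )


def stream_subset(organization_name, dataset_name, sql):
    """Run the query on Redivis and return a pandas DataFrame.

    Assumes the redivis-python client is installed and authenticated;
    the import is deferred so the query builder works without it.
    """
    import redivis  # assumed: pip install redivis

    dataset = redivis.organization(organization_name).dataset(dataset_name)
    return dataset.query(sql).to_pandas_dataframe()


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    sql = build_subset_query(
        "business_records", "pa", 2015, 2020, columns=("company", "employees")
    )
    print(sql)
```

Filtering by state and year before streaming keeps the result well below the full ~1 TB and makes it easier to stay within the 1 GB guideline described above.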

 

Due to these limitations, we encourage interested users to consult with us by reaching out to both:

 

Ryan Splenda, Business & Economics Librarian

Kristen Scotti, Open Science Postdoctoral Associate

Business & Economics Librarian

Ryan Splenda
Contact:
109C Hunt Library
Carnegie Mellon University Libraries
4909 Frew Street
Pittsburgh, PA 15213
(412) 268-2453

Open Science Postdoc

Kristen Scotti
Contact:
4416 Sorrells Library
Carnegie Mellon University Libraries
Wean Hall
Hamerschlag Dr
Pittsburgh, PA 15213