uiuc-bdeep/InfoUSA_Database

InfoUSA_Database Overview

Basic pipeline:

                                       store_infousa.py                      BDEEPInfousa R package               
Household_Ethnicity_<year>.txt    --------------------------->   Postgres   ----------------------->     R      ------>    Further   
       (Raw TXT files)            --------------------------->   Database   ----------------------->    Data    ------>    Processing ...

TXT -> Postgres Database

store_infousa.py loads the raw InfoUSA TXT files into a PostgreSQL database. The script takes the year as its only argument. For example, to store the data for 2006, run the following on the database machine:

python3 store_infousa.py 2006
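
As a rough illustration of how the year argument could map to an input file, here is a minimal sketch. The helper name `input_path_for` is hypothetical and the actual argument handling in store_infousa.py may differ; only the `Household_Ethnicity_<year>.txt` naming pattern comes from the pipeline diagram above.

```python
import argparse


def input_path_for(argv):
    """Parse the year from the command line and build the raw file name.

    Hypothetical helper for illustration; not taken from store_infousa.py.
    """
    parser = argparse.ArgumentParser(description="Store one year of InfoUSA data.")
    parser.add_argument("year", type=int, help="year to store, e.g. 2006")
    args = parser.parse_args(argv)
    # File naming pattern as shown in the pipeline diagram.
    return f"Household_Ethnicity_{args.year}.txt"


if __name__ == "__main__":
    import sys

    print(input_path_for(sys.argv[1:]))
```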

The script uses SQLAlchemy to create the database table and insert rows into it. Unlike the Zillow data, the InfoUSA data can be read into a pandas data frame, so the rows can be inserted into the database in chunks, which gives better performance. Note that the variable DTYPEIN holds the column types used by pandas when reading the file, while the variable DTYPE holds the column types used by the database engine. The two must be consistent.
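
The chunked insert and the DTYPEIN/DTYPE pairing can be sketched roughly as follows. This is not the code from store_infousa.py: the column names, the tab delimiter, the table name, and the function `store_in_chunks` are all assumptions made for illustration, and an in-memory SQLite engine stands in for the Postgres connection so that the sketch runs without a database server.

```python
import io

import pandas as pd
from sqlalchemy import Integer, String, create_engine, text

# Hypothetical columns; the real lists live in store_infousa.py.
# DTYPEIN: types pandas uses when parsing the text file.
DTYPEIN = {"FAMILYID": "int64", "LAST_NAME": str, "ZIP": str}
# DTYPE: matching column types for the database engine.
DTYPE = {"FAMILYID": Integer(), "LAST_NAME": String(), "ZIP": String()}


def store_in_chunks(source, engine, table, chunksize=2):
    """Read a delimited text file chunk by chunk and append each chunk to `table`."""
    total = 0
    # Assumed tab-delimited input; the real delimiter may differ.
    reader = pd.read_csv(source, sep="\t", dtype=DTYPEIN, chunksize=chunksize)
    for chunk in reader:
        chunk.to_sql(table, engine, if_exists="append", index=False, dtype=DTYPE)
        total += len(chunk)
    return total


# Tiny made-up sample standing in for a raw TXT file.
raw = io.StringIO(
    "FAMILYID\tLAST_NAME\tZIP\n"
    "1\tSmith\t61801\n"
    "2\tLee\t02139\n"
    "3\tGarcia\t60601\n"
)

# SQLite in memory for the sketch; a Postgres URL would be used in practice.
engine = create_engine("sqlite://")
rows_written = store_in_chunks(raw, engine, "infousa_2006")

with engine.connect() as conn:
    stored_zips = [row[0] for row in conn.execute(
        text("SELECT ZIP FROM infousa_2006 ORDER BY FAMILYID"))]
```

Reading ZIP as a string on the pandas side and storing it as a string column on the database side is one example of keeping the two type maps consistent: if pandas parsed it as an integer, a leading zero would be lost before the row ever reached the database.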

Postgres Database -> R

To transfer data from the database into RDS files, we use the BDEEPInfousa R package.

This package opens a direct connection to the database and retrieves the data. A type reference table is also available. See the package folder for details.

An Example Using the Database: Race Prediction Analysis

InfoUSA predicts the ethnicity of each recorded name and stores the prediction in a separate column. This information is important for researchers who study cultural differences and discrimination. Here, we analyze how consistent the InfoUSA prediction is with that of another commonly used method, the R WRU package. See the folder for more details.
