directory for ruxton.ai developed code for imputing race & ethnicity using name/geo/other data

tleitch/raceImpute

Custom raceBERT

Preparation

  1. Make sure the folder structure stays the same:

    a. data goes in the data folder
    b. .py files go in the models folder
    c. the trained_models folder must exist to store the trained models

root  
|-- data/  
|-- models/  
|-- trained_models/  

  2. Make sure the data in the data folder follows the same column structure as nmzpAgeSexFL.csv, or edit preprocess_data.py so that it produces the right .parquet files for training.

  3. Make sure you can reach the Hugging Face model hub to download the raceBERT/BERT models, or edit raceBERT_train.py to load the models from a local path.
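
The folder layout from step 1 can be checked before running anything. The snippet below is a minimal sketch and not part of the repository: the helper name missing_dirs is hypothetical, and it only assumes the three folder names listed above.

```python
from pathlib import Path

# Folder names taken from the layout described in step 1.
REQUIRED_DIRS = ("data", "models", "trained_models")


def missing_dirs(root="."):
    """Return the required sub-folders that do not exist under root."""
    root_path = Path(root)
    return [name for name in REQUIRED_DIRS if not (root_path / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Folder layout looks complete.")
```

Run it from the root folder. An empty result means all three folders are present.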

Run

Run the following commands from the root folder:

python -m models.preprocess_data data/nmzpAgeSexFL.csv
python -m models.raceBERT_train raceBERT florida_5label
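
The two commands above can also be chained from a small driver script, so that training starts only if preprocessing succeeds. This is a sketch under assumptions: build_commands and run_pipeline are hypothetical helpers that do not exist in the repository, and the defaults simply repeat the arguments shown above.

```python
import subprocess
import sys


def build_commands(data_file="data/nmzpAgeSexFL.csv",
                   model_label="raceBERT",
                   data_label="florida_5label"):
    """Build the two module invocations from the Run section, in order."""
    return [
        [sys.executable, "-m", "models.preprocess_data", data_file],
        [sys.executable, "-m", "models.raceBERT_train", model_label, data_label],
    ]


def run_pipeline(commands):
    """Run each command in turn and stop at the first failure."""
    for command in commands:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

Calling run_pipeline(build_commands()) from the root folder runs preprocessing and then training with the same interpreter that launched the script.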

Details

preprocess_data.py

  1. python -m models.preprocess_data [data_file]
  2. If you use different data, it must follow the column structure of nmzpAgeSexFL.csv.
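
One way to check a new data file against nmzpAgeSexFL.csv is to compare the header rows. The sketch below assumes both files are comma-separated with a header on the first line. The helper names are hypothetical, and the check covers column names only, not types or values.

```python
import csv


def read_header(csv_path):
    """Return the first row of a CSV file as a list of column names."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])


def column_differences(reference_csv, candidate_csv):
    """Return (missing, extra) column names of candidate versus reference."""
    reference = read_header(reference_csv)
    candidate = read_header(candidate_csv)
    missing = [name for name in reference if name not in candidate]
    extra = [name for name in candidate if name not in reference]
    return missing, extra
```

If both returned lists are empty, the new file has the same column names as the reference file, although possibly in a different order.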

raceBERT_train.py

  1. python -m models.raceBERT_train [model_label] [data_label]
  2. Check raceBERT_train.py for the existing model_label and data_label values, or edit raceBERT_train.py to add custom options.
