- Make sure the folder structure remains the same:
  a. data in the data folder
  b. .py files in the models folder
  c. a trained_models folder exists to store the trained models
root
|-- data/
|-- models/
|-- trained_models/
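
A quick way to confirm the layout before running anything is a small check script. This is only a sketch, not part of the repository: the folder names are taken from the list above, and the helper name missing_dirs is made up for illustration.

```python
from pathlib import Path

# Folder names taken from the list above; adjust if your checkout differs.
REQUIRED_DIRS = ("data", "models", "trained_models")


def missing_dirs(root="."):
    """Return the required folders that do not exist under root."""
    root = Path(root)
    return [name for name in REQUIRED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Folder structure looks fine.")
```

Run it from the root folder. It only checks that the folders exist, not what is inside them.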
- Make sure the data in the data folder follows the same structure as nmzpAgeSexFL.csv, or edit preprocess_data.py to produce the right .parquet files for training.
- Make sure you can connect to the transformer hub to download the raceBERT/BERT models, or edit raceBERT_train.py to load the models from a local copy.
Run the following commands in the root folder:
python -m models.preprocess_data data/nmzpAgeSexFL.csv
python -m models.raceBERT_train raceBERT florida_5label
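
If you prefer to launch both steps from one script, the two commands above can be wrapped as follows. This is a minimal sketch under the assumption that it is run from the root folder. The function names build_commands and run_pipeline are hypothetical and do not exist in the repository.

```python
import subprocess
import sys


def build_commands(data_file, model_label, data_label):
    """Build the two command lines shown above, using the current interpreter."""
    return [
        [sys.executable, "-m", "models.preprocess_data", data_file],
        [sys.executable, "-m", "models.raceBERT_train", model_label, data_label],
    ]


def run_pipeline(data_file, model_label, data_label):
    """Run preprocessing, then training; stop at the first failing step."""
    for cmd in build_commands(data_file, model_label, data_label):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)
```

For example, run_pipeline("data/nmzpAgeSexFL.csv", "raceBERT", "florida_5label") mirrors the two commands above.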
preprocess_data.py
python -m models.preprocess_data [data_file]
- If you use different data, it needs to follow the column structure in nmzpAgeSexFL.csv.
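
One way to check a new data file against nmzpAgeSexFL.csv is to compare the header rows. The sketch below assumes both files are UTF-8 CSVs whose first row holds the column names. That is an assumption about the data, not something preprocess_data.py guarantees, and the helper names are hypothetical.

```python
import csv


def read_header(path):
    """Return the first row of a CSV file as a list of column names."""
    with open(path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f), [])


def header_mismatch(reference_csv, new_csv):
    """Return (missing, extra) column names of new_csv relative to reference_csv."""
    ref = read_header(reference_csv)
    new = read_header(new_csv)
    missing = [c for c in ref if c not in new]
    extra = [c for c in new if c not in ref]
    return missing, extra
```

For example, header_mismatch("data/nmzpAgeSexFL.csv", "data/my_data.csv"), where my_data.csv stands in for your own file, returns two empty lists when the column names match. It does not compare column order or value types.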
raceBERT_train.py
python -m models.raceBERT_train [model_label] [data_label]
- Check raceBERT_train.py for the existing model_label and data_label values, or edit raceBERT_train.py for custom options.