Source code for GovStat.us website.
The web server is implemented using Flask on Python 3.10.
Install locally using pip install .
- python
- unitedstates/congress
- git-lfs
- A MySQL server (e.g. MariaDB)
Python dependencies will be pulled in automatically by pip.
Data sources: Congress data on bills and votes comes from the scrapers in the unitedstates/congress repo. Budget data comes from Excel files published by the White House Office of Management and Budget (OMB).
To obtain Congress data, run the following from the root of this repo:
usc-run votes --congress=XXX --session=YYYY --force=True --fast=True
usc-run govinfo --bulkdata=BILLSTATUS --congress=XXX
usc-run bills
where XXX is the Congress number, and YYYY is the session, given as a year.
For example,
usc-run votes --congress=117 --session=2022 --force=True --fast=True
usc-run govinfo --bulkdata=BILLSTATUS --congress=117
usc-run bills
Budget data is carried in this repo via git-lfs.
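The three scraper commands above can also be driven from a small script when several Congresses need to be fetched. The following is a minimal sketch, not part of this repo: the function names build_scraper_commands and run_scrapers are hypothetical, and it assumes usc-run is on the PATH and that the script is started from the same directory as the commands above.

```python
import subprocess


def build_scraper_commands(congress: int, session: int) -> list[list[str]]:
    """Return the three usc-run invocations from this README as argument lists."""
    return [
        ["usc-run", "votes", f"--congress={congress}", f"--session={session}",
         "--force=True", "--fast=True"],
        ["usc-run", "govinfo", "--bulkdata=BILLSTATUS", f"--congress={congress}"],
        ["usc-run", "bills"],
    ]


def run_scrapers(congress: int, session: int) -> None:
    """Run each scraper in order; check=True stops at the first failure."""
    for cmd in build_scraper_commands(congress, session):
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


# Example call, matching the README example:
# run_scrapers(congress=117, session=2022)
```

Keeping the command construction separate from the execution makes it possible to print or inspect the commands before anything is run.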
Simple start for MariaDB:
sudo mariadb-install-db --user=mysql --basedir=/usr --datadir=/var/lib/mysql
sudo systemctl start mariadb.service
cp app/cfg/config.sample.json app/cfg/config.json
and edit the appropriate values in app/cfg/config.json.
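After copying the sample config, it can be useful to confirm that no key from config.sample.json was dropped while editing. The sketch below is one possible check, not something shipped with this repo: missing_keys and check_config are hypothetical helpers, and the code assumes both files hold a JSON object at the top level. It makes no assumption about which keys the sample actually contains.

```python
import json
from pathlib import Path


def missing_keys(sample: dict, config: dict, prefix: str = "") -> list[str]:
    """List dotted key paths present in the sample but absent from the config."""
    missing = []
    for key, value in sample.items():
        path = f"{prefix}{key}"
        if key not in config:
            missing.append(path)
        elif isinstance(value, dict) and isinstance(config[key], dict):
            # Recurse into nested objects so nested omissions are reported too.
            missing.extend(missing_keys(value, config[key], prefix=f"{path}."))
    return missing


def check_config(cfg_dir: str = "app/cfg") -> list[str]:
    """Compare config.json against config.sample.json in cfg_dir."""
    sample = json.loads(Path(cfg_dir, "config.sample.json").read_text())
    config = json.loads(Path(cfg_dir, "config.json").read_text())
    return missing_keys(sample, config)
```

An empty list from check_config() means every key in the sample is also present in the edited config; it does not validate the values themselves.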
flask db init
flask db migrate -m "initial migration"
flask db upgrade
After creating the Flask MySQL database, run the following commands to populate it:
python vote_loader.py
python bill_loader.py
python budget_loader.py
To launch the webapp:
gunicorn -b localhost:5000 -w 4 govstat:app
- Host Name (localhost)
- Port Number (5000)
- Number of Worker Processes (4)
- Flask app and entrypoint (govstat:app)
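The same four settings can be kept in a gunicorn configuration file instead of on the command line. Gunicorn config files are plain Python, so a minimal sketch could look like the following. The file is not part of this repo; the name gunicorn.conf.py is an assumption, and the wsgi_app setting requires gunicorn 20.1.0 or newer.

```python
# gunicorn.conf.py (assumed file name; load it with `gunicorn -c gunicorn.conf.py`)

# Address and port to listen on, equivalent to `-b localhost:5000`.
bind = "localhost:5000"

# Number of worker processes, equivalent to `-w 4`.
workers = 4

# WSGI application as module:variable, equivalent to the positional
# `govstat:app` argument. On gunicorn versions older than 20.1.0 this
# setting is not available; pass govstat:app on the command line instead.
wsgi_app = "govstat:app"
```

With this file in place, the launch command shortens to gunicorn -c gunicorn.conf.py, which is convenient when the same invocation is reused by a process manager.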
See above to run gunicorn.
Specify NGINX port permissions and forwarding for HTTP and HTTPS requests in /etc/nginx/sites-enabled/
Configure supervisor to run the gunicorn app in /etc/supervisor/conf.d/
Create SSL certificates
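For the supervisor step above, the file placed in /etc/supervisor/conf.d/ is an INI-style program section. The sketch below renders such a section with Python's configparser so the gunicorn command stays in one place. It is an illustration only: supervisor_program is a hypothetical helper, the program name and working directory are placeholders to replace, and the actual supervisor configuration used for this site is not documented here.

```python
import configparser
import io


def supervisor_program(name: str, command: str, directory: str) -> str:
    """Render a supervisor [program:NAME] section as INI text."""
    # interpolation=None keeps any '%' in the command from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser[f"program:{name}"] = {
        "command": command,        # what supervisor should run
        "directory": directory,    # working directory for the command
        "autostart": "true",       # start when supervisord starts
        "autorestart": "true",     # restart if the process exits
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


# Placeholder values; adjust the name and directory for the real deployment.
print(supervisor_program(
    name="govstat",
    command="gunicorn -b localhost:5000 -w 4 govstat:app",
    directory="/path/to/repo",
))
```

The printed text can be saved as a .conf file under /etc/supervisor/conf.d/ and picked up with supervisorctl reread followed by supervisorctl update.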
congress/
+-- govstat/
+-- app/
+-- Bills.py
+-- Budget.py
+-- config.py
+-- __init__.py [App instantiation, database instantiation, import functions for data loading and retrieval.]
+-- models.py
+-- routes.py
+-- Votes.py
+-- static/
+-- templates/
+-- govstat.py
+-- setup.py
+-- bill_loader.py
+-- vote_loader.py
+-- data/
+-- 116/
+-- amendments/
+-- hamdt/ [House Amendments]
+-- hamdtN/
+-- [JSON and XML files]
+-- samdt/ [Senate Amendments]
+-- samdtN/
+-- [JSON and XML files]
+-- bills/
+-- hconres/
+-- hconresN/
+-- [XML files. After processing, JSON files]
+-- hjres/
+-- hjresN/
+-- [XML files. After processing, JSON files]
+-- hr/
+-- hrN/
+-- [XML files. After processing, JSON files]
+-- hres/
+-- hresN/
+-- [XML files. After processing, JSON files]
+-- s/
+-- sN/
+-- [XML files. After processing, JSON files]
+-- sconres/
+-- sconresN/
+-- [XML files. After processing, JSON files]
+-- sjres/
+-- sjresN/
+-- [XML files. After processing, JSON files]
+-- sres/
+-- sresN/
+-- [XML files. After processing, JSON files]
+-- votes/
+-- 2020/
+-- hN/
+-- [JSON and XML files]
+-- sN/
+-- [JSON and XML files]
+-- 2021/ [One directory per year]
+-- 117/ ... [One directory per Congress number]
+-- hist_fy21/ [Historical data through 2021 from Office of Management and Budget (OMB)]
+-- [51 XLSX files containing data]
+-- supplemental/
+-- [XLSX files containing supplemental budget data]
+-- upcoming_house_floor/
+-- [JSON files per week containing bill activities that week]
+-- tasks/
+-- [PY files for each type of data that can be scraped and delivered]
+-- [amendments, bills, committees, govinfo, nominations, votes, upcoming, etc.]
+-- scripts/
+-- [SH scripts to transform raw JSON and XML data into forms usable for govtrack and other utilities.]
+-- cache/
+-- test/
+-- [Test scripts, not exhaustive]
+-- contrib/
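As a quick sanity check after scraping, the bill directories in the layout above can be counted per bill type. This is a minimal sketch under the assumption that the data is laid out as data/<congress>/bills/<type>/<type>N/, which is how the listing above reads; count_bills is a hypothetical helper and not one of the loaders in this repo.

```python
from collections import Counter
from pathlib import Path


def count_bills(data_dir: str, congress: int) -> Counter:
    """Count bill directories per bill type under <data_dir>/<congress>/bills/."""
    counts: Counter = Counter()
    bills_root = Path(data_dir) / str(congress) / "bills"
    if not bills_root.is_dir():
        # Nothing scraped for this Congress yet.
        return counts
    for type_dir in sorted(bills_root.iterdir()):
        if type_dir.is_dir():
            # Each subdirectory (e.g. hr1, hr2) is one bill of this type.
            counts[type_dir.name] = sum(1 for d in type_dir.iterdir() if d.is_dir())
    return counts


# Example call for the 116th Congress:
# print(count_bills("data", 116))
```

A type that is missing or has a count of zero is a hint that the corresponding scraper run did not complete.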

