Merged
12 changes: 8 additions & 4 deletions README.Rmd
@@ -172,7 +172,7 @@ There are some minor differences between PTAXSIM and the real bill. The taxing d
We can also look at a single property over multiple years, in this case broken out by taxing district. To do so, pass a vector of multiple years to the `year_vec` argument of `tax_bill()`:

```{r multi_year_1, message=FALSE, warning=FALSE}
- multiple_years <- tax_bill(2010:2023, "14081020210000")
+ multiple_years <- tax_bill(2010:2024, "14081020210000")
multiple_years
```

@@ -217,10 +217,10 @@ multiple_years_plot <- ggplot(data = multiple_years_summ) +
scale_y_continuous(
name = "Total Tax Amount",
labels = scales::dollar,
- expand = c(0, 0),
+ expand = expansion(mult = c(0, .05)),
Member Author

With the dramatic 2024 increase I thought the axis spacing was making it look as though it was cut off, so I added some more buffer here.

Member

[Question, non-blocking] It's a bit hard to tell since the Y-axes are not aligned, but it looks as if the increment value is now larger between 2016 and 2023 than it was in the previous version of the chart, even setting aside the big jump in 2024 (which is also pretty shocking, so much so that we should fact check it). Do you see that too? If so, any idea why that might be the case?

Member Author

I see that too when looking at the diff - this .png was generated when I knitted the file but when I am running that code chunk alone it appears correct and aligned with the old viz. Weird, not sure why that is happening!
Here is the image I get when I run it on its own:
[image]

And as for the dramatic spike, this PIN's AV spiked in 2024, going from 70,000 to 125,001. Perhaps we should select a PIN with more stable AV?

Member

> this .png was generated when I knitted the file but when I am running that code chunk alone it appears correct and aligned with the old viz. Weird, not sure why that is happening!

That's so strange! Let me know if you want a second eye debugging it. I get the correct and more aligned version when I run it and knit it locally too, although I have to tweak the two DBI::dbConnect() paths to point to the correct database -- I wonder if it's possible that you have two different versions of the database, one of which gets loaded instead of the other when knitting vs. running locally? Very low confidence in that diagnosis, it's just an idea as to the kind of thing that might be going wrong. In any case, I think we should make sure we have the correct / aligned version of the chart before we merge.

> And as for the dramatic spike, this PIN's AV spiked in 2024, going from 70,000 to 125,001. Perhaps we should select a PIN with more stable AV?

Got it, that makes sense! I don't mind keeping it as is, it's kind of interesting to see such a dramatic spike in 2024.

Member Author

I think I have addressed the problem. When knitting the doc, it was loading the existing version of the ptaxsim package rather than the dev version, which meant TIF amounts were not being calculated correctly. I knit the doc with the dev version loaded and pushed the updated viz .png and .md files, but I'm not 100% sure that was the right way to do it.
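
The diagnosis above (the knit resolving to a different copy of the package than the interactive session) can be checked directly. A minimal sketch, assuming nothing beyond base R: `report_pkg()` is a hypothetical helper, not part of PTAXSIM, and it is demonstrated on the `stats` package only so that it runs anywhere. Calling it with `"ptaxsim"` in the README's setup chunk and again in the console would show whether both sessions see the same version and path.

```r
# Hypothetical helper (not part of PTAXSIM): report which copy of a
# package the current R session resolves to.
report_pkg <- function(pkg) {
  list(
    package = pkg,
    # Version recorded in the package's DESCRIPTION file
    version = as.character(utils::packageVersion(pkg)),
    # Directory the package is found in
    path = find.package(pkg)
  )
}

# Demonstrated on a base package so the sketch runs without PTAXSIM
info <- report_pkg("stats")
str(info)
```

If the two sessions disagree, loading the development copy before knitting (for example with `devtools::load_all()`) is one option, though the thread does not say which approach was actually used.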

n.breaks = 8
) +
- scale_x_continuous(name = "Year", n.breaks = 7) +
+ scale_x_continuous(name = "Year", n.breaks = 8) +
scale_fill_manual(values = scales::hue_pal()(10)) +
theme_minimal() +
guides(fill = guide_legend(title = "District Type"))
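
For readers unfamiliar with the `expand` change in this hunk, here is a minimal sketch of what `expansion(mult = c(0, .05))` does, assuming ggplot2 (and scales, which it depends on) is installed. The `totals` data frame holds made-up illustrative values, not PTAXSIM output.

```r
library(ggplot2)

# Hypothetical yearly totals (illustrative values only, not PTAXSIM output)
totals <- data.frame(
  year = 2020:2024,
  total = c(4000, 4200, 4100, 4300, 8000)
)

# expansion(mult = c(lower, upper)) pads each end of the axis by a
# fraction of the axis range. c(0, .05) keeps the bars flush with zero
# and leaves 5% headroom above the tallest bar, whereas the previous
# expand = c(0, 0) left no padding on either side.
pad <- expansion(mult = c(0, .05))

p <- ggplot(totals) +
  geom_col(aes(x = year, y = total)) +
  scale_y_continuous(labels = scales::dollar, expand = pad)
```

Printing `p` should show the tallest bar ending slightly below the top of the panel rather than touching it.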
@@ -253,6 +253,8 @@ The PTAXSIM backend database contains cleaned data from the Cook County Clerk, T
| tif | Clerk | [TIF Reports - Cook County Summary Reports](https://www.cookcountyclerkil.gov/property-taxes/tifs-tax-increment-financing/tif-reports) | [data-raw/tif/tif.R](data-raw/tif/tif.R) | TIF revenue, start year, and cancellation year |
| tif_crosswalk | Clerk | Manually created from TIF summary and distribution reports | [data-raw/tif/tif.R](data-raw/tif/tif.R) | Fix for data issue identified in #39 |
| tif_distribution | Clerk | [TIF Reports - Tax Increment Agency Distribution Reports](https://www.cookcountyclerkil.gov/property-taxes/tifs-tax-increment-financing/tif-reports) | [data-raw/tif/tif.R](data-raw/tif/tif.R) | TIF EAV, frozen EAV, and distribution percentage by tax code |
| pin_tif_distribution | Clerk | [TIF Reports - Tax Increment Agency Distribution Reports](https://www.cookcountyclerkil.gov/property-taxes/tifs-tax-increment-financing/tif-reports) | [data-raw/tif/tif.R](data-raw/tif/tif.R) | TIF EAV, frozen EAV, and distribution percentage by PIN |


### Database diagram

@@ -264,7 +266,9 @@ The PTAXSIM backend database contains cleaned data from the Cook County Clerk, T

## Notes and caveats

- - Currently, the per-district tax calculations for properties in the Red-Purple Modernization (RPM) TIF are slightly flawed. However, the total tax bill per PIN is still accurate. See issue [#4](https://github.com/ccao-data/ptaxsim/issues/4) for more information or issue [#56](https://github.com/ccao-data/ptaxsim/issues/56).
Comment thread
jeancochrane marked this conversation as resolved.

+ - PTAXSIM's tax year 2024 update required significant changes to the database and package. Please see the PTAXSIM [changelog](https://ccao-data.github.io/ptaxsim/news) for more details.
+ - The per-district tax calculation using `tax_bill(simplify = TRUE)` for properties in transit TIFs do not match the amounts that the Treasurer reports on their tax bills. We believe the amounts we report are correct, however. See issues [#4](https://github.com/ccao-data/ptaxsim/issues/4) and [#56](https://github.com/ccao-data/ptaxsim/issues/56) for more information, as well as PR [#58](https://github.com/ccao-data/ptaxsim/pull/58).
- Special Service Area (SSA) rates must be calculated manually when creating counterfactual bills. See issue [#3](https://github.com/ccao-data/ptaxsim/issues/3) for more information.
- In rare instances, a TIF can have multiple `agency_num` identifiers (usually there's only one per TIF). The `tif_crosswalk` table determines what the "main" `agency_num` is for each TIF and pulls the name and TIF information using that identifier. See issue [GitLab #39](https://gitlab.com/ccao-data-science---modeling/packages/ptaxsim/-/issues/39) for more information.
- PTAXSIM is relatively memory-efficient and can calculate every district line-item for every tax bill for the last 15 years (roughly 350 million rows). However, the memory required for this calculation is substantial (around 100 GB).
89 changes: 48 additions & 41 deletions README.md
@@ -36,8 +36,8 @@ Table of Contents
> installation](#database-installation) for details.
>
> [**Link to PTAXSIM
- > database**](https://ccao-data-public-us-east-1.s3.amazonaws.com/ptaxsim/ptaxsim-2024.0.0-alpha.1.db.bz2)
- > (DB version: 2024.0.0; Last updated: 2026-03-18 20:38:53)
+ > database**](https://ccao-data-public-us-east-1.s3.amazonaws.com/ptaxsim/ptaxsim-2024.0.0-alpha.2.db.bz2)
+ > (DB version: 2024.0.0; Last updated: 2026-04-14 22:42:59)

PTAXSIM is an R package/database to approximate Cook County property tax
bills. It uses real assessment, exemption, TIF, and levy data to
@@ -172,7 +172,7 @@ database:

1. Download the compressed database file from the CCAO’s public S3
bucket. [Link
- here](https://ccao-data-public-us-east-1.s3.amazonaws.com/ptaxsim/ptaxsim-2024.0.0-alpha.1.db.bz2).
+ here](https://ccao-data-public-us-east-1.s3.amazonaws.com/ptaxsim/ptaxsim-2024.0.0-alpha.2.db.bz2).
2. (Optional) Rename the downloaded database file by removing the
version number, i.e. ptaxsim-2024.0.0.db.bz2 becomes
`ptaxsim.db.bz2`.
@@ -538,22 +538,22 @@ broken out by taxing district. To do so, pass a vector of multiple years
to the `year_vec` argument of `tax_bill()`:

``` r
- multiple_years <- tax_bill(2010:2023, "14081020210000")
+ multiple_years <- tax_bill(2010:2024, "14081020210000")
multiple_years
#> Key: <year, pin, agency_num>
- #> year pin class tax_code av eav agency_num
- #> <int> <char> <char> <char> <int> <int> <char>
- #> 1: 2010 14081020210000 206 73001 69062 227905 010010000
- #> 2: 2010 14081020210000 206 73001 69062 227905 010020000
- #> 3: 2010 14081020210000 206 73001 69062 227905 030210000
- #> 4: 2010 14081020210000 206 73001 69062 227905 030210001
- #> 5: 2010 14081020210000 206 73001 69062 227905 030210002
- #> ---
- #> 152: 2023 14081020210000 206 73105 70000 211141 044060000
- #> 153: 2023 14081020210000 206 73105 70000 211141 044060000
- #> 154: 2023 14081020210000 206 73105 70000 211141 050200000
- #> 155: 2023 14081020210000 206 73105 70000 211141 050200001
- #> 156: 2023 14081020210000 206 73105 70000 211141 080180000
+ #> year pin class tax_code av eav agency_num
+ #> <int> <char> <char> <char> <int> <int> <char>
+ #> 1: 2010 14081020210000 206 73001 69062 227905 010010000
+ #> 2: 2010 14081020210000 206 73001 69062 227905 010020000
+ #> 3: 2010 14081020210000 206 73001 69062 227905 030210000
+ #> 4: 2010 14081020210000 206 73001 69062 227905 030210001
+ #> 5: 2010 14081020210000 206 73001 69062 227905 030210002
+ #> ---
+ #> 163: 2024 14081020210000 206 73105 125001 379441 044060000
+ #> 164: 2024 14081020210000 206 73105 125001 379441 044060000
+ #> 165: 2024 14081020210000 206 73105 125001 379441 050200000
+ #> 166: 2024 14081020210000 206 73105 125001 379441 050200001
+ #> 167: 2024 14081020210000 206 73105 125001 379441 080180000
#> agency_name agency_major_type agency_minor_type
#> <char> <char> <char>
#> 1: COUNTY OF COOK COOK COUNTY COOK
@@ -562,24 +562,24 @@ multiple_years
#> 4: CITY OF CHICAGO LIBRARY F... MUNICIPALITY/TOWNSHIP LIBRARY
#> 5: CITY OF CHICAGO SCHOOL BL... MUNICIPALITY/TOWNSHIP MISC
#> ---
- #> 152: BOARD OF EDUCATION SCHOOL UNIFIED
- #> 153: BOARD OF EDUCATION - from... SCHOOL UNIFIED
- #> 154: CHICAGO PARK DISTRICT MISCELLANEOUS PARK
- #> 155: CHICAGO PARK DISTRICT AQU... MISCELLANEOUS BOND
- #> 156: METRO WATER RECLAMATION D... MISCELLANEOUS WATER
+ #> 163: BOARD OF EDUCATION SCHOOL UNIFIED
+ #> 164: BOARD OF EDUCATION - from... SCHOOL UNIFIED
+ #> 165: CHICAGO PARK DISTRICT MISCELLANEOUS PARK
+ #> 166: CHICAGO PARK DISTRICT AQU... MISCELLANEOUS BOND
+ #> 167: METRO WATER RECLAMATION D... MISCELLANEOUS WATER
#> agency_tax_rate final_tax
#> <num> <num>
- #> 1: 0.00423 964.040
- #> 2: 0.00051 116.230
- #> 3: 0.00914 2083.050
- #> 4: 0.00102 232.460
- #> 5: 0.00116 264.370
+ #> 1: 0.00423000 964.04
+ #> 2: 0.00051000 116.23
+ #> 3: 0.00914000 2083.05
+ #> 4: 0.00102000 232.46
+ #> 5: 0.00116000 264.37
#> ---
- #> 152: 0.03829 5411.540
- #> 153: 0.00000 2673.052
- #> 154: 0.00318 493.830
- #> 155: 0.00000 0.000
- #> 156: 0.00345 535.760
+ #> 163: 0.03630964 5662.50
+ #> 164: 0.00000000 8114.87
+ #> 165: 0.00294209 590.33
+ #> 166: 0.00000000 0.00
+ #> 167: 0.00340445 683.10
```

The `tax_bill()` function will automatically combine the years and PIN
@@ -629,18 +629,18 @@ multiple_years_plot <- ggplot(data = multiple_years_summ) +
scale_y_continuous(
name = "Total Tax Amount",
labels = scales::dollar,
- expand = c(0, 0),
+ expand = expansion(mult = c(0, .05)),
n.breaks = 8
) +
- scale_x_continuous(name = "Year", n.breaks = 7) +
+ scale_x_continuous(name = "Year", n.breaks = 8) +
scale_fill_manual(values = scales::hue_pal()(10)) +
theme_minimal() +
guides(fill = guide_legend(title = "District Type"))
```

</details>

- <img src="man/figures/README-multi_year_4-1.png" alt="" width="85%" />
+ <img src="man/figures/README-multi_year_4-1.png" width="85%" />

For more advanced usage, such as counterfactual analysis, see the
[vignettes
@@ -669,6 +669,7 @@ data was available in mid-2020.
| tif | Clerk | [TIF Reports - Cook County Summary Reports](https://www.cookcountyclerkil.gov/property-taxes/tifs-tax-increment-financing/tif-reports) | [data-raw/tif/tif.R](data-raw/tif/tif.R) | TIF revenue, start year, and cancellation year |
| tif_crosswalk | Clerk | Manually created from TIF summary and distribution reports | [data-raw/tif/tif.R](data-raw/tif/tif.R) | Fix for data issue identified in \#39 |
| tif_distribution | Clerk | [TIF Reports - Tax Increment Agency Distribution Reports](https://www.cookcountyclerkil.gov/property-taxes/tifs-tax-increment-financing/tif-reports) | [data-raw/tif/tif.R](data-raw/tif/tif.R) | TIF EAV, frozen EAV, and distribution percentage by tax code |
| pin_tif_distribution | Clerk | [TIF Reports - Tax Increment Agency Distribution Reports](https://www.cookcountyclerkil.gov/property-taxes/tifs-tax-increment-financing/tif-reports) | [data-raw/tif/tif.R](data-raw/tif/tif.R) | TIF EAV, frozen EAV, and distribution percentage by PIN |

### Database diagram

@@ -680,12 +681,18 @@ data was available in mid-2020.

## Notes and caveats

- - Currently, the per-district tax calculations for properties in the
- Red-Purple Modernization (RPM) TIF are slightly flawed. However, the
- total tax bill per PIN is still accurate. See issue
- [\#4](https://github.com/ccao-data/ptaxsim/issues/4) for more
- information or issue
- [\#56](https://github.com/ccao-data/ptaxsim/issues/56).
+ - PTAXSIM’s tax year 2024 update required significant changes to the
+ database and package. Please see the PTAXSIM
+ [changelog](https://ccao-data.github.io/ptaxsim/news) for more
+ details.
+ - The per-district tax calculation using `tax_bill(simplify = TRUE)` for
+ properties in transit TIFs do not match the amounts that the Treasurer
+ reports on their tax bills. We believe the amounts we report are
+ correct, however. See issues
+ [\#4](https://github.com/ccao-data/ptaxsim/issues/4) and
+ [\#56](https://github.com/ccao-data/ptaxsim/issues/56) for more
+ information, as well as PR
+ [\#58](https://github.com/ccao-data/ptaxsim/pull/58).
- Special Service Area (SSA) rates must be calculated manually when
creating counterfactual bills. See issue
[\#3](https://github.com/ccao-data/ptaxsim/issues/3) for more
2 changes: 1 addition & 1 deletion data-raw/create_db.sql
@@ -88,7 +88,7 @@ CREATE TABLE agency_fund_info (
fund_num varchar(6) NOT NULL,
fund_name varchar NOT NULL,
capped_ind boolean NOT NULL,
- PRIMARY KEY (agency_num, fund_type_num, fund_num)
+ PRIMARY KEY (agency_num, fund_num)
Member

[Question, non-blocking] Just to clarify: I believe we will need to do a new database export to reflect this change, correct? It's not a big deal -- the most important advantages of primary keys are uniqueness constraints and indexes, neither of which is particularly important for this use-case -- but I just want to make sure my understanding is correct.

Member Author

Yep, that's correct! We will ultimately need to do that now anyway to incorporate the new agency crosswalks.

) WITHOUT ROWID;

CREATE INDEX ix_agency_fund_info_capped_ind ON agency_fund_info(capped_ind);
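
The uniqueness constraint mentioned in the review thread can be seen in a small in-memory sketch. This assumes the DBI and RSQLite packages are installed. The table is a trimmed-down stand-in for `agency_fund_info`: the `varchar(9)` length for `agency_num` is an assumption, and the inserted rows are invented for illustration.

```r
library(DBI)

# In-memory SQLite database; nothing here touches the real PTAXSIM file
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Trimmed-down stand-in for agency_fund_info with the new two-column key
# (the agency_num length is an assumption)
dbExecute(con, "
  CREATE TABLE agency_fund_info (
    agency_num varchar(9) NOT NULL,
    fund_num varchar(6) NOT NULL,
    fund_name varchar NOT NULL,
    capped_ind boolean NOT NULL,
    PRIMARY KEY (agency_num, fund_num)
  ) WITHOUT ROWID
")

# Invented rows: the same fund number under two different agencies is fine
dbExecute(con, "INSERT INTO agency_fund_info VALUES ('010010000', '001', 'EXAMPLE FUND', 1)")
dbExecute(con, "INSERT INTO agency_fund_info VALUES ('010020000', '001', 'EXAMPLE FUND', 1)")

# Repeating an (agency_num, fund_num) pair violates the primary key
dup_ok <- tryCatch(
  {
    dbExecute(con, "INSERT INTO agency_fund_info VALUES ('010010000', '001', 'DUPLICATE', 0)")
    TRUE
  },
  error = function(e) FALSE
)

n_rows <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM agency_fund_info")$n
dbDisconnect(con)
```

Because the key is part of the table definition, an already-exported database file keeps the old key until it is rebuilt, which is why the thread concludes that a new export is needed.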
42 changes: 28 additions & 14 deletions inst/mermaid/er-diagram-big.mmd
@@ -3,6 +3,7 @@ erDiagram
agency {
int year PK
varchar agency_num PK
varchar authority_num
boolean home_rule_ind
int agg_ext_base_year
bigint lim_numerator
@@ -11,26 +12,14 @@ erDiagram
bigint prior_eav
bigint curr_new_prop
bigint cty_cook_eav
bigint cty_dupage_eav
bigint cty_lake_eav
bigint cty_will_eav
bigint cty_kane_eav
bigint cty_mchenry_eav
bigint cty_dekalb_eav
bigint cty_grundy_eav
bigint cty_kankakee_eav
bigint cty_kendall_eav
bigint cty_lasalle_eav
bigint cty_livingston_eav
bigint cty_overlap_eav
bigint cty_total_eav
double pct_burden
bigint total_levy
bigint total_max_levy
double total_prelim_rate
bigint total_reduced_levy
bigint total_final_levy
double total_final_rate
varchar reduction_type
double reduction_pct
double total_non_cap_ext
double total_ext
@@ -43,6 +32,9 @@ erDiagram
varchar agency_name_original
varchar major_type
varchar minor_type
varchar agency_num_24
varchar agency_name_24
boolean agency_change_24
}

agency_fund {
@@ -56,12 +48,14 @@ erDiagram
bigint max_levy
double prelim_rate
bigint ptell_reduced_levy
boolean ptell_reduced_ind
bigint final_levy
double final_rate
}

agency_fund_info {
varchar agency_num PK
varchar fund_type_num
varchar fund_type_name
varchar fund_num PK
varchar fund_name
boolean capped_ind
@@ -113,6 +107,7 @@ erDiagram
int exe_vet_dis_lt50
int exe_vet_dis_50_69
int exe_vet_dis_ge70
int exe_vet_dis_100
int exe_abate
}

@@ -167,8 +162,26 @@ erDiagram
double tax_code_distribution_pct
}

pin_tif_distribution {
int year PK
varchar pin PK
varchar agency_num PK
varchar tax_code_num
double tax_code_rate
int pin_eav
int pin_frozen_eav
double pin_revenue
int pin_increment_eav
double pin_distribution_pct
double transit_tif_to_cps
double transit_tif_to_tif
double transit_tif_to_dist
boolean is_transit_tif
}

eq_factor ||--|{ pin : "applies to"
pin ||--|{ tax_code : "within"
Comment thread
kyrasturgill marked this conversation as resolved.
pin ||--o| pin_tif_distribution : "may have"
cpi ||--|{ agency : "applies to"
tax_code }|--|| agency : "has"
tax_code ||--o| tif_distribution : "may have"
@@ -177,6 +190,7 @@ erDiagram
agency_fund_info ||--|{ agency_fund : "describes"
tif ||--|| tif_crosswalk : "in"
tif_distribution }|--|| tif_crosswalk : "in"
pin_tif_distribution }|--|| tif_crosswalk : "in"
agency_info ||--o{ tif: "describes"
tax_code }|--o| tif : "may have"
pin_geometry ||--o| pin : "has"
11 changes: 11 additions & 0 deletions inst/mermaid/er-diagram-small.mmd
@@ -23,6 +23,7 @@ erDiagram
}

agency_fund_info {
varchar agency_num PK
varchar fund_num PK
varchar fund_name
boolean capped_ind
@@ -90,8 +91,17 @@ erDiagram
double tax_code_distribution_pct
}

pin_tif_distribution {
int year PK
varchar pin PK
varchar agency_num PK
varchar tax_code_num
double pin_distribution_pct
}

eq_factor ||--|{ pin : "applies to"
pin ||--|{ tax_code : "within"
Comment thread
jeancochrane marked this conversation as resolved.
pin ||--o| pin_tif_distribution : "may have"
cpi ||--|{ agency : "applies to"
tax_code }|--|| agency : "has"
tax_code ||--o| tif_distribution : "may have"
@@ -100,6 +110,7 @@ erDiagram
agency_fund_info ||--|{ agency_fund : "describes"
tif ||--|| tif_crosswalk : "in"
tif_distribution }|--|| tif_crosswalk : "in"
pin_tif_distribution }|--|| tif_crosswalk : "in"
agency_info ||--o{ tif: "describes"
tax_code }|--o| tif : "may have"
pin_geometry ||--o| pin : "has"
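
The new `pin ||--o| pin_tif_distribution : "may have"` relationship can be read as an optional match per PIN. A minimal sketch of what that implies for a query, assuming DBI and RSQLite are installed: the two tables are trimmed-down stand-ins using only columns from the small diagram, the `(year, pin)` join key is an assumption, and every PIN, agency number, and percentage below is a made-up placeholder.

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Trimmed-down stand-ins for the real tables (placeholder rows only)
dbWriteTable(con, "pin", data.frame(
  year = c(2024L, 2024L),
  pin = c("00000000000001", "00000000000002")
))
dbWriteTable(con, "pin_tif_distribution", data.frame(
  year = 2024L,
  pin = "00000000000001",
  agency_num = "000000001",
  tax_code_num = "00001",
  pin_distribution_pct = 0.25
))

# "may have" means zero or one match per PIN, so a LEFT JOIN keeps PINs
# with no TIF row and fills their TIF columns with NA.
res <- dbGetQuery(con, "
  SELECT p.year, p.pin, d.agency_num, d.pin_distribution_pct
  FROM pin AS p
  LEFT JOIN pin_tif_distribution AS d
    ON p.year = d.year AND p.pin = d.pin
  ORDER BY p.pin
")
dbDisconnect(con)
res
```

An inner join would instead drop every PIN outside a TIF, which is usually not what a bill calculation wants.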
Binary file modified man/figures/README-multi_year_4-1.png