Commit 07a93f4

Merge branch 'main' into pre-commit-ci-update-config

2 parents c72e3d8 + 28c9272

File tree

15 files changed: +5409 -127 lines changed

.exercises/README.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+# .exercises
+
+In this directory you will find scripts that will exercise the Sharrow library.
+The initial exercises are `mtc` and `sandag`, which include bash scripts that
+will reproduce the respective example model's unit tests that are otherwise run
+using GitHub Actions. This allows the user to easily test those examples locally.
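
A quick local usage sketch (not part of the commit): the exercise scripts call `gh`, `uv`, and `git` without checking for them, so verify the tools first, then invoke a script from the repository root:

```bash
# Preflight: the exercise scripts invoke gh, uv, and git without checking for them.
for tool in gh uv git; do
  command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool" >&2; exit 1; }
done

bash .exercises/mtc/run_mtc.sh    # or: bash .exercises/sandag/run_sandag.sh
```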

.exercises/mtc/run_mtc.sh

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+#!/usr/bin/env bash
+
+set -euxo pipefail
+
+# This script runs the MTC example model with sharrow, mirroring the GitHub Actions workflow.
+
+cd "$(dirname "$0")"
+
+for repo in "driftlesslabs/activitysim" "ActivitySim/activitysim-prototype-mtc"; do
+  dir=$(basename "$repo")
+  if [ ! -d "$dir" ] || [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
+    gh repo clone "$repo" -- --depth 1
+  else
+    git -C "$dir" pull --ff-only || git -C "$dir" pull
+  fi
+done
+
+uv venv
+source .venv/bin/activate
+uv pip install -e ../..  # install sharrow in editable mode
+uv pip install ./activitysim
+uv pip install pytest nbmake
+
+cd activitysim-prototype-mtc
+python -m pytest ./test

.exercises/sandag/run_sandag.sh

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+#!/usr/bin/env bash
+
+set -euxo pipefail
+
+# This script runs the SANDAG example model with sharrow, mirroring the GitHub Actions workflow.
+
+cd "$(dirname "$0")"
+
+for repo in "driftlesslabs/activitysim" "ActivitySim/sandag-abm3-example"; do
+  dir=$(basename "$repo")
+  if [ ! -d "$dir" ] || [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
+    gh repo clone "$repo" -- --depth 1
+  else
+    git -C "$dir" pull --ff-only || git -C "$dir" pull
+  fi
+done
+
+uv venv
+source .venv/bin/activate
+uv pip install -e ../..  # install sharrow in editable mode
+uv pip install ./activitysim
+uv pip install pytest nbmake
+
+cd sandag-abm3-example
+python -m pytest ./test
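
The two scripts above are identical apart from the example repository and the final test directory, so they could in principle share a parameterized helper. A hypothetical consolidation sketch (the `clone_or_update` function and the positional argument are illustrative, not part of this commit):

```bash
#!/usr/bin/env bash
set -euxo pipefail

# Hypothetical consolidation of run_mtc.sh and run_sandag.sh: the commit's
# clone-or-pull logic, wrapped in a function so each repo is handled once.
clone_or_update() {
  local repo dir
  repo="$1"
  dir=$(basename "$repo")
  if [ ! -d "$dir" ] || [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
    gh repo clone "$repo" -- --depth 1
  else
    git -C "$dir" pull --ff-only || git -C "$dir" pull
  fi
}

# The example repo becomes the only per-invocation difference.
example="${1:-ActivitySim/activitysim-prototype-mtc}"  # or: ActivitySim/sandag-abm3-example
clone_or_update "driftlesslabs/activitysim"
clone_or_update "$example"
```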

.github/workflows/run-tests.yml

Lines changed: 62 additions & 84 deletions
@@ -7,8 +7,6 @@ on:
       - 'v[0-9]+.[0-9]+**'
   pull_request:
     branches: [ main, develop ]
-    tags:
-      - 'v[0-9]+.[0-9]+**'
   workflow_dispatch:

 jobs:
@@ -21,16 +19,12 @@ jobs:
         shell: bash -l {0}
     steps:
       - uses: actions/checkout@v4
-      - uses: actions/setup-python@v5
+      - name: Code Quality Check with Ruff
+        # code quality check, stop the build for any errors
+        uses: astral-sh/ruff-action@v3
         with:
-          python-version: '3.11'
-      - name: Install Ruff
-        run: |
-          python -m pip install ruff
-      - name: Lint with Ruff
-        run: |
-          # code quality check, stop the build for any errors
-          ruff check . --show-fixes --exit-non-zero-on-fix
+          version: "latest"
+          args: "check . --show-fixes --exit-non-zero-on-fix"

   test-minimal:
     needs: fmt
@@ -41,72 +35,61 @@ jobs:
         shell: bash -l {0}
     steps:
       - uses: actions/checkout@v4
-      - uses: actions/setup-python@v5
+      - uses: astral-sh/setup-uv@v7
         with:
+          version: "0.9.6"
+          enable-cache: true
+          cache-dependency-glob: "uv.lock"
           python-version: '3.11'
-      - name: Install pytest
-        run: |
-          python -m pip install pytest pytest-cov pytest-regressions pytest-xdist nbmake
-      - name: Install sharrow
-        run: |
-          python -m pip install .
       - name: Initial simple tests
         # tests that sharrow can be imported and that categorical tests can be run
         run: |
-          python -m pytest sharrow/tests/test_categorical.py
-      - name: Install openmatrix
-        run: |
-          python -m pip install openmatrix
+          uv run pytest sharrow/tests/test_categorical.py
       - name: Dataset tests
         # tests that the datasets can be read and that the tests can be run
         run: |
-          python -m pytest sharrow/tests/test_datasets.py
-      - name: Install zarr and dask-diagnostics
-        run: |
-          python -m pip install zarr "dask[diagnostics]"
+          uv run pytest sharrow/tests/test_datasets.py
       - name: More complete test with pytest
         run: |
-          python -m pytest -v --disable-warnings sharrow/tests
+          uv run pytest -v --disable-warnings sharrow/tests

   test:
     needs: fmt
     name: ${{ matrix.os }} py${{ matrix.python-version }}
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        os: ["ubuntu-latest", "macos-latest", "windows-latest"]
-        python-version: ["3.10", "3.11", "3.12"]
+        os: ["ubuntu-latest", "windows-latest"]
+        python-version: ["3.10", "3.11", "3.12", "3.13"]
+      fail-fast: false
     defaults:
       run:
         shell: bash -l {0}
     steps:
       - uses: actions/checkout@v4
-      - name: Install Python and Dependencies
-        uses: conda-incubator/setup-miniconda@v3
+      - uses: astral-sh/setup-uv@v7
         with:
-          miniforge-version: latest
-          environment-file: envs/testing.yml
+          version: "0.9.6"
           python-version: ${{ matrix.python-version }}
-          activate-environment: testing-env
-          auto-activate-base: false
-          auto-update-conda: false
-      - name: Install sharrow
-        run: |
-          python -m pip install .
-      - name: Conda checkup
+      - name: File contents
         run: |
-          conda info -a
-          conda list
-      - name: Lint with Ruff
+          cat sharrow/example_data.py
+      - name: UV sync
         run: |
-          # code quality check
-          # stop the build if there are Python syntax errors or undefined names
-          ruff check . --select=E9,F63,F7,F82 --no-fix
-          # stop the build for any other configured Ruff linting errors
-          ruff check . --show-fixes --exit-non-zero-on-fix
+          uv self version
+          uv cache clean
+          uv sync --locked
+      - name: Syntax Check with Ruff
+        uses: astral-sh/ruff-action@v3
+        with:
+          args: "check . --select=E9,F63,F7,F82 --no-fix"
+      - name: Code Quality Check with Ruff
+        uses: astral-sh/ruff-action@v3
+        with:
+          args: "check . --show-fixes --exit-non-zero-on-fix"
       - name: Test with pytest
         run: |
-          python -m pytest
+          uv run --locked pytest

   deploy-docs:
     needs: test
@@ -165,6 +148,7 @@
       with:
         user: __token__
         password: ${{ secrets.PYPI_API_TOKEN }}
+
   activitysim-examples:
     # test that updates to sharrow will not break the activitysim canonical examples
     needs: fmt
@@ -177,17 +161,18 @@
         - region: ActivitySim 1-Zone Example (MTC)
           region-org: ActivitySim
           region-repo: activitysim-prototype-mtc
-          region-branch: pandas2
+          region-branch: extended
         - region: ActivitySim 2-Zone Example (SANDAG)
           region-org: ActivitySim
           region-repo: sandag-abm3-example
-          region-branch: pandas2
+          region-branch: main
       fail-fast: false
     defaults:
       run:
         shell: bash -l {0}
     name: ${{ matrix.region }}
     runs-on: ubuntu-latest
+    timeout-minutes: 720  # Sets the timeout to 12 hours
     steps:
       - name: Checkout Sharrow
         uses: actions/checkout@v4
@@ -201,45 +186,37 @@
           ref: 'main'
           path: 'activitysim'

-      - name: Setup Miniforge
-        uses: conda-incubator/setup-miniconda@v3
+      - name: Setup UV
+        uses: astral-sh/setup-uv@v7
         with:
-          miniforge-version: latest
-          activate-environment: asim-test
+          version: "0.9.6"
+          enable-cache: true
+          cache-dependency-glob: "uv.lock"
           python-version: ${{ env.python-version }}

       - name: Set cache date for year and month
         run: echo "DATE=$(date +'%Y%m')" >> $GITHUB_ENV

-      - uses: actions/cache@v4
-        with:
-          path: |
-            ${{ env.CONDA }}/envs
-            ~/.cache/ActivitySim
-          key: ${{ env.label }}-conda-${{ hashFiles('activitysim/conda-environments/github-actions-tests.yml') }}-${{ env.DATE }}-${{ env.CACHE_NUMBER }}
-        id: cache
-
-      - name: Update environment
+      - name: Create Virtual Env
         run: |
-          conda env update -n asim-test -f activitysim/conda-environments/github-actions-tests.yml
-        if: steps.cache.outputs.cache-hit != 'true'
-
-      - name: Install sharrow
-        # installing from source
-        run: |
-          python -m pip install ./sharrow
-
-      - name: Install activitysim
-        # installing without dependencies is faster, we trust that all needed dependencies
-        # are in the conda environment defined above. Also, this avoids pip getting
-        # confused and reinstalling tables (pytables).
-        run: |
-          python -m pip install ./activitysim --no-deps
-
-      - name: Conda checkup
+          uv venv
+          source .venv/bin/activate
+          uv pip install "black==22.12.0" "coveralls==3.3.1" \
+            "cytoolz==0.12.2" "dask==2023.11.*" "isort==5.12.0" \
+            "multimethod<2.0" "nbmake==1.4.6" "numba==0.57.*" \
+            "numpy==1.24.*" "openmatrix==0.3.5.0" "orca==1.8" \
+            "pandera>=0.15,<0.18.1" "pandas==2.2.*" "platformdirs==3.2.*" \
+            "psutil==5.9.*" "pyarrow==11.*" "pydantic==2.6.*" "pypyr==5.8.*" \
+            "tables>=3.9" "pytest==7.2.*" "pytest-cov" "pytest-regressions" \
+            "pyyaml==6.*" "requests==2.28.*" "ruff" "scikit-learn==1.2.*" \
+            "sharrow>=2.9.1" "simwrapper>1.7" "sparse" "xarray==2025.01.*" \
+            "zarr>=2,<3" "zstandard" \
+            ./sharrow ./activitysim
+
+      - name: UV checkup
         run: |
-          conda info -a
-          conda list
+          source .venv/bin/activate
+          uv pip list

       - name: Checkout Example
         uses: actions/checkout@v4
@@ -250,5 +227,6 @@

       - name: Test ${{ matrix.region }}
         run: |
-          cd ${{ matrix.region-repo }}/test
-          python -m pytest .
+          source .venv/bin/activate
+          cd ${{ matrix.region-repo }}
+          python -m pytest ./test
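
The net effect of these workflow changes is that environment setup moves from conda to uv. For local debugging, a rough equivalent of the reworked `test` job's steps might be (assuming uv is installed and the repository's `uv.lock`, referenced by the cache settings above, is present):

```bash
uv self version         # report the uv release in use
uv sync --locked        # build .venv exactly as pinned in uv.lock; fail if the lock is stale
uv run --locked pytest  # run the test suite inside that synced environment
```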

docs/walkthrough/one-dim.ipynb

Lines changed: 24 additions & 8 deletions
@@ -305,8 +305,8 @@
     "\n",
     "Then, it's time to prepare our data. We'll create a `DataTree`\n",
     "that defines the relationships among all the datasets we're working\n",
-    "with. This is a tree in the mathematical sense, with nodes referencing\n",
-    "the datasets and edges representing the relationships."
+    "with. This is a tree roughly in the mathematical sense, with nodes referencing\n",
+    "the dataset dimensions and edges representing the relationships."
    ]
   },
   {
@@ -355,25 +355,41 @@
    "source": [
     "The first named dataset we include, `tour`, is by default the root node of this data tree.\n",
     "We then can define an arbitrary number of other named data nodes. Here, we add `person`, `hh`,\n",
-    "`odt_skims` and `odt_skims`. Note that these last two are actually two different names for the\n",
+    "`odt_skims` and `dot_skims`. Note that these last two are actually two different names for the\n",
     "same underlying dataset, and for each name we will next define a unique set of relationships.\n",
+    "For each of these other data nodes, we will need to define some way to link each dimension of\n",
+    "them back to the root node, so that for any position in the root node's arrays, we can find\n",
+    "one corresponding value in each of the other datasets' variables.\n",
     "\n",
     "All data nodes in this tree are stored as `Dataset` objects. We can give a pandas DataFrame\n",
-    "in this contructor instead, but it will be automatically converted into a one-dimension `Dataset`.\n",
+    "in this constructor instead, but it will be automatically converted into a one-dimension `Dataset`.\n",
     "The conversion is no-copy if possible (and it is usually possible) so no additional memory is\n",
     "consumed in the conversion.\n",
     "\n",
     "The `relationships` defines links of the data tree. Each relationship maps a particular variable\n",
     "in a named upstream dataset to a particular dimension of a named downstream dataset. For example,\n",
     "`\"person.household_id @ hh.HHID\"` tells the tree that the `household_id` variable in the `person` \n",
-    "dataset contains labels (`@`) that map to the `HHID` dimension of the `hh` dataset.\n",
+    "dataset contains labels (`@`) that map to the `HHID` dimension of the `hh` dataset. Similarly,\n",
+    "`\"tour.PERID @ person.PERID\"` tells the tree that the `PERID` variable in the `tour` dataset\n",
+    "contains labels that map to the `PERID` dimension of the `person` dataset. From this, we can\n",
+    "see that any position in the \"tour\" dataset can be mapped to a position in the \"person\" dataset,\n",
+    "in a many-to-one manner, and from there to a position in the \"hh\" dataset, also in a many-to-one\n",
+    "manner. Unlike tours, persons, and households, the `skims` datasets are multi-dimensional, so we need to\n",
+    "map multiple dimensions. For the `odt_skims` dataset, we map the origin TAZ dimension (`otaz`)\n",
+    "to the household TAZ (`hh.TAZ`), and the destination TAZ dimension (`dtaz`) to the tour\n",
+    "destination TAZ (`tour.dest_taz_idx`), and the time period dimension (`time_period`) to the\n",
+    "tour outbound time period (`tour.out_time_period`). This way, even though the skims dataset\n",
+    "is multi-dimensional, we can still find one unique position in the skims dataset for each\n",
+    "position in the tours dataset. The same is done for the `dot_skims` dataset, which actually\n",
+    "contains the same data as `odt_skims`, but the mapping of the dimensions is different, so a\n",
+    "different unique position in the skims dataset is found for each position in the tours dataset.\n",
     "\n",
     "In addition to mapping by label, we can also map by position, by using the `->` operator in the\n",
     "relationship string instead of `@`. In the example above, we map the tour destination TAZ's in\n",
     "this manner, as the `dest_taz_idx` variable in the `tours` dataset contains positional references\n",
     "instead of labels.\n",
     "\n",
-    "A special case for the relationship mapping is available when the source varibable\n",
+    "A special case for the relationship mapping is available when the source variable\n",
     "in the upstream dataset is explicitly categorical. In this case, sharrow checks that\n",
     "the categories exactly match the labels in the referenced downstream dataset dimension,\n",
     "and that there are no missing categorical values. If they do match and there are no\n",
@@ -1450,7 +1466,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "wide_logsums = wide_flow.logit_draws(b, logsums=1, compile_watch=\"simple\")[-1]"
+    "wide_logsums = wide_flow.logit_draws(b, logsums=1, compile_watch=True)[-1]"
    ]
   },
   {
@@ -1460,7 +1476,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "%time wide_logsums = wide_flow.logit_draws(b, logsums=1, compile_watch=\"simple\")[-1]\n",
+    "%time wide_logsums = wide_flow.logit_draws(b, logsums=1, compile_watch=True)[-1]\n",
     "wide_logsums"
    ]
   },
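
Reading the revised notebook prose together, the `DataTree` construction it describes might look roughly like the following sketch. The dataset variable names (`tours`, `persons`, `households`, `skims`) and the exact `dot_skims` mappings are assumptions inferred from the text, not copied from the notebook:

```python
import sharrow as sh

# A sketch of the tree described above: `tour` is the root node; `odt_skims`
# and `dot_skims` are two names for the same skims dataset, linked differently.
# tours, persons, households, and skims are assumed to already be loaded.
tree = sh.DataTree(
    tour=tours,
    person=persons,
    hh=households,
    odt_skims=skims,
    dot_skims=skims,
    relationships=(
        "tour.PERID @ person.PERID",            # label-based (@) mapping
        "person.household_id @ hh.HHID",
        "hh.TAZ @ odt_skims.otaz",
        "tour.dest_taz_idx -> odt_skims.dtaz",  # position-based (->) mapping
        "tour.out_time_period @ odt_skims.time_period",
        "hh.TAZ @ dot_skims.dtaz",              # o/d reversed for the return leg
        "tour.dest_taz_idx -> dot_skims.otaz",
        "tour.out_time_period @ dot_skims.time_period",  # assumed; an inbound period may be used instead
    ),
)
```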
