453: pyxform relative references (in repeats) improvements #777

lindsay-stevens · 2025-08-28T16:47:14Z

Closes #453

Why is this the best possible solution? Were any other approaches considered?

Details in commit messages but in summary this PR greatly improves test coverage for relative references calculation, and refactors the existing function responsible for the core part of that calculation to be much clearer as to what it is doing. It's not necessarily doing the right thing but at least it's a lot easier to read and the behaviour is consistent with the old function. So one of the goals in this PR is to hopefully get a clearer idea of what pyxform should be doing exactly in regards to relative references, which may allow further simplifications.

The new function uses existing and improved object hierarchy traversal functions to find the lowest common ancestor repeat (or other relevant ancestors) instead of the old approach of splitting XPath strings then looking up the objects. A large range of a/symmetrical nesting topologies are also now able to work per e6f5d4a, among which is the case in #453.

To keep the tests simple as possible at this stage, they are written as unit tests i.e. PyxformTestCase may activate other code paths that would further confuse matters, like the application of current() to the generated relative path. But to ensure that the tests are still doing something relevant, the tests optionally run the generated Survey through ODK Validate. And to make that easier in general, there's now an envar for running validate against all PyxformTestCase tests, and that flag is also used to check these unit tests with Validate, and a GitHub actions job is added to run that check for PRs.

What are the regression risks?

While the new test cases are extensive (+477 scenarios) they generally focus on combinations of nesting up to 2 levels under the lowest common ancestor repeat, and one or two levels above that ancestor, in order to cover likely use cases and the apparent code paths. So it's possible that combinations with more nesting above or below the LCAR could still have bugs or acquire new regressions. Existing tests are preserved - except for a couple that needed some changes to work with ODK Validate as mentioned above.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

I don't think it needs docs changes, unless there's docs saying that this sort of thing doesn't work e.g. guidance that suggests using explicit XPaths instead of using ${} references.

Before submitting this PR, please make sure you have:

included test cases for core behavior and edge cases in tests
run python -m unittest and verified all tests pass
run ruff format pyxform tests and ruff check pyxform tests to lint code
verified that any code or assets from external sources are properly credited in comments

- probably not worth enforcing (e.g. `def f(*, k)`) but this makes it clearer how insert_xpaths is being used. For instance: - `context` is always the calling SurveyElement, except for when processing references for setvalue/setgeopoint where the context is the Survey (actually this can just be get_xpath() lookup). - `use_current` only used in conjunction with choice filters. - `reference_parent` only used when the itemset contains a pyxform variable reference e.g. `select_one ${ref}`.

- should not need to be any other value but this approach is hopefully backwards compatible in case some external code uses these classes

- this function is the core of relative paths processing but is difficult to understand and unnecessarily uses string manipulation (i.e. the object tree can be navigated using the actual objects). - however before it can be refactored, as guidance and to avoid regressions, a reasonably comprehensive test suite is needed. - existing pyxformtestcases (mainly in test_repeat.py) cover many topologies but since there are so many different cases that can result in different behaviour they have been serialised to csv instead of pyxformtestcases. - Another reason is that other code in Survey.insert_xpaths modifies the output from share_same_repeat_parent which introduces even more variables (it's already complicated). - Yet another reason is that the csv format of these tests allows visualisations for understanding the behaviour (or bug) patterns. - many of the behaviours seem like they might be bugs and return (None, None) rather than a path, so in the current suite those cases show what seems like should be returned, but are marked as such ("expect_none"=1) so that an equivalent refactored function can be implemented; or at least it can be clearer from the git diff how the new function differs from the old one. - the current test suite covers an initial but not fully exhaustive set of cases that: - seem most likely usages i.e. none of LCAR, target or source nested between zero and 2 levels of grouping - enter each apparent code path (except for the IndexError branch on survey.py L130 - no current tests in pyxform hit that branch) - seem to be valid combinations of (see docstring of build_survey_from_path_spec for naming conventions): - lowest common ancestor repeat contexts: - /y - /y/g1o - /y/r1o - /y/r1o/g2o - reference target paths under LCAR: - /a/t - /a/g1t/t - /a/r1t/t - /a/g1t/g2t/t - /a/g1t/r2t/t - /a/r1t/g2t/t - /a/r1t/r2t/t - reference source paths under LCAR: - /a/s - /a/g1t/s - /a/g1s/s - /a/r1s/s - /a/g1t/g2t/s - /a/g1t/g2s/s - /a/g1t/r2s/s - /a/g1s/g2s/s - /a/g1s/r2s/s - /a/r1s/g2s/s - /a/r1s/r2s/s - if the inconsistencies and potential bugs can be resolved then these tests could be replaced with general rules-based generated tests.

- use objects directly to navigate and hierarchy rather than strings - simplify the existing logic in a way that is consistent with old - allow all cases where previously (None, None) was returned, to return a relative path (was a small amount of typos in previously failing cases that are corrected now that the cases are being executed) - it's possible that these were intentionally returning None, which (it seems) ultimately results in an absolute path being returned, but that doesn't seem to make sense because before share_same_repeat_parent is invoked, it's already been decided that a relative path is needed via _is_return_relative_path(). - about performance changes, it seems about the same timing but that's probably OK considering this version doesn't have a lru_cache, which is also maybe why it's using less peak memory.

- has been a manual pre-release step for a long time so probably about time to make it easier and run it as part of PR test actions. - required some minor changes to existing tests that failed when ODK Validate was run - see inline comments.

- check that the survey structure in each case generates a valid XForm

lognaturel

I understand the CSV test format and spot checked a dozen lines to make sure they match my understanding.

I've spent enough time with the new implementations that I feel pretty confident I can explain them. I spent some time thinking about alternative approaches or further simplification but not enough to feel like I would have identified good alternatives. I also didn't spend enough time with the implementation to feel confident I haven't missed edge cases.

Given #778 (comment), I think we should merge now. That will also give some time for any issues to get found on the staging server.

@lindsay-stevens please address conflicts, consider naming suggestions and merge away!

tests/test_repeat.py

tests/pyxform_test_case.py

.github/workflows/verify.yml

lognaturel · 2025-09-17T18:54:28Z

tests/test_survey.py

+                    expect_none=case["expect_none"],
+                )
+
+    def test_relative_paths__outer_gg(self):


I'm not immediately seeing what makes these cases special -- could they also have been in the CSV? Are they maybe just to have examples to reference? I did find that helpful.

The CSV has a set of 118 tests of unique 2-level (under LCAR) scenarios, repeated for 4 LCAR contexts: /y, /y/g1o, /y/r1o, /y/r1o/g2o. The extra test methods like the one highlighted (outer_gg, outer_gr, etc) are smoke tests for additional LCAR contexts that I kept separate to a) keep the diff down and b) in consideration of future test maintenance (balancing benefit of coverage with burden of updating it).

I think in theory the "LCAR does or does not have an ancestor repeat" behaviour pattern should remain the same no matter how many levels of nesting above the LCAR. The smoke test cases could be added to the CSV as additional sets pretty easily. Among these extra LCAR contexts, are there any that do or do not seem like likely use cases to add?

Got it, makes sense! Keeping them separate seems fine.

I'm realizing looking at these cases more closely that I don't fully understand reference_parent. But I don't think it's critical for merge given that all tests pass.

Yes I think the next battles will be understanding / testing / fixing relative references behaviour of reference_parent, use_current, and indexed-repeat.

pyxform/survey.py

- conflicts: - pyxform/question.py - pyxform/survey.py - pyxform/survey_element.py - tests/test_repeat.py

- no cases added or removed, just to make it easier to read/update - sorted by the first 4 columns: - lcar_context - target_path - source_path - reference_parent

lindsay-stevens · 2025-09-18T08:40:22Z

@ln thanks for the review, I've resolved feedback and merge conflicts. Could you please confirm if you think some more cases should be added per this comment? If not should be ok to merge.

lognaturel · 2025-09-18T19:25:05Z

tests/test_survey.py

+                    expect_none=case["expect_none"],
+                )
+
+    def test_relative_paths__outer_gg(self):


Got it, makes sense! Keeping them separate seems fine.

I'm realizing looking at these cases more closely that I don't fully understand reference_parent. But I don't think it's critical for merge given that all tests pass.

lindsay-stevens added 5 commits August 26, 2025 17:38

add: perf bench for relative ref in repeat group

b6a36ab

add: set default type for Grouped/Repeat Section

2146d53

- should not need to be any other value but this approach is hopefully backwards compatible in case some external code uses these classes

lindsay-stevens force-pushed the pyxform-refs branch from 9f799ce to 4f526d0 Compare August 29, 2025 11:19

add: envar + action for running odk validate

4b86953

- has been a manual pre-release step for a long time so probably about time to make it easier and run it as part of PR test actions. - required some minor changes to existing tests that failed when ODK Validate was run - see inline comments.

lindsay-stevens force-pushed the pyxform-refs branch from 4f526d0 to 4b86953 Compare August 29, 2025 11:26

lindsay-stevens added 2 commits August 29, 2025 21:35

add: put relative path tests in odk validate scope

f0fc33c

- check that the survey structure in each case generates a valid XForm

add: repro test case for pyxform/453

84ee504

lindsay-stevens force-pushed the pyxform-refs branch from fceb6b5 to 84ee504 Compare August 29, 2025 14:41

lindsay-stevens changed the title ~~pyxform refs + relative references in repeats~~ 453: pyxform relative references improvements Aug 29, 2025

lindsay-stevens requested a review from lognaturel August 29, 2025 15:06

lindsay-stevens marked this pull request as ready for review August 29, 2025 15:06

lindsay-stevens changed the title ~~453: pyxform relative references improvements~~ 453: pyxform relative references (in repeats) improvements Aug 29, 2025

lindsay-stevens mentioned this pull request Sep 15, 2025

775: Support Entities from repeats #778

Merged

9 tasks

lognaturel reviewed Sep 18, 2025

View reviewed changes

lognaturel approved these changes Sep 18, 2025

View reviewed changes

lindsay-stevens added 3 commits September 18, 2025 17:05

dev: merge branch 'master' into 'pyxform-refs'

c1dd6e2

- conflicts: - pyxform/question.py - pyxform/survey.py - pyxform/survey_element.py - tests/test_repeat.py

dev: sort share_same_repeat_parent_cases fixture

6c401c5

- no cases added or removed, just to make it easier to read/update - sorted by the first 4 columns: - lcar_context - target_path - source_path - reference_parent

fix: clarify func name, fix typo

038b34c

lognaturel approved these changes Sep 18, 2025

View reviewed changes

lognaturel merged commit 7fcc25f into XLSForm:master Sep 18, 2025
14 checks passed

lindsay-stevens deleted the pyxform-refs branch September 19, 2025 06:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

453: pyxform relative references (in repeats) improvements #777

453: pyxform relative references (in repeats) improvements #777

Uh oh!

lindsay-stevens commented Aug 28, 2025 •

edited

Loading

Uh oh!

lognaturel left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

lognaturel Sep 17, 2025

Uh oh!

lindsay-stevens Sep 18, 2025

Uh oh!

lognaturel Sep 18, 2025

Uh oh!

lindsay-stevens Sep 19, 2025

Uh oh!

Uh oh!

Uh oh!

lindsay-stevens commented Sep 18, 2025

Uh oh!

lognaturel Sep 18, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

453: pyxform relative references (in repeats) improvements #777

453: pyxform relative references (in repeats) improvements #777

Uh oh!

Conversation

lindsay-stevens commented Aug 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why is this the best possible solution? Were any other approaches considered?

What are the regression risks?

Does this change require updates to documentation? If so, please file an issue here and include the link below.

Before submitting this PR, please make sure you have:

Uh oh!

lognaturel left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

lognaturel Sep 17, 2025

Choose a reason for hiding this comment

Uh oh!

lindsay-stevens Sep 18, 2025

Choose a reason for hiding this comment

Uh oh!

lognaturel Sep 18, 2025

Choose a reason for hiding this comment

Uh oh!

lindsay-stevens Sep 19, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

lindsay-stevens commented Sep 18, 2025

Uh oh!

lognaturel Sep 18, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

lindsay-stevens commented Aug 28, 2025 •

edited

Loading