@folded folded commented May 26, 2025

  • Simplify the implementations of graph walking.
  • Update comments and move only_stages application

@folded folded requested a review from a team as a code owner May 26, 2025 04:01
@folded folded requested review from MattWellie and violetbrina May 26, 2025 04:03
@cpg-software-ci-bot

📊 SonarQube Summary

| Metric | This PR | Main Branch |
| --- | --- | --- |
| ✅ Coverage | 75.7% | 75.7% |
| 💨 Code Smells | 15 | 17 |
| 🐞 Bugs | 0 | 0 |
| 🔐 Vulnerabilities | 0 | 0 |
| 🚨 Security Hotspots | 1 | 0 |
| 🌟 Quality Gate | ✅ OK | ✅ OK |

@MattWellie MattWellie left a comment

Very interesting, this looks much simpler. The shadow compute test is compelling.

cpg-software-ci-bot commented May 26, 2025

📊 SonarQube Summary

| Metric | This PR | Main Branch |
| --- | --- | --- |
| ✅ Coverage | 76.3% | 76.3% |
| 💨 Code Smells | 44 | 47 |
| 🐞 Bugs | 0 | 0 |
| 🔐 Vulnerabilities | 0 | 0 |
| 🚨 Security Hotspots | 1 | 0 |
| 📝 New Issues | 0 | 0 |
| 🌟 Quality Gate | ✅ OK | ✅ OK |

@violetbrina violetbrina marked this pull request as draft May 28, 2025 05:21
@folded folded marked this pull request as ready for review June 10, 2025 03:08
@violetbrina

Apologies for my delayed response.

Feel free to go ahead and bump the requests version to pass the security check.

If you could install the pre-commit hooks as well, that would be great. I should have said this earlier, but it is all described in the contributors file.

https://github.com/populationgenomics/cpg-flow/blob/main/CONTRIBUTING.md

If this branch contains a fix, include at least one commit prefixed with fix: ...; a new feature uses feat: ...; and for a breaking change, add an exclamation mark after the type, for example feat!: new breaking change.

See https://www.conventionalcommits.org/en/v1.0.0/ for a full breakdown of the convention.

@violetbrina

Just need to merge so it's not out of sync with main.

@folded folded changed the title from "simplify graph ops" to "chore(refactor): simplify graph ops" Jun 19, 2025
github-actions bot commented Oct 2, 2025

🐳 Docker Image Built

A new Docker image has been built for this PR:

Image: australia-southeast1-docker.pkg.dev/cpg-common/images-dev/cpg_flow:dfa7e69708407784fd683434db446ff80d390725

Pull command:

docker pull australia-southeast1-docker.pkg.dev/cpg-common/images-dev/cpg_flow:dfa7e69708407784fd683434db446ff80d390725

🔗 View in Google Cloud Console


This comment was automatically generated by the Docker workflow.

@MattWellie MattWellie left a comment

I like this: a compelling test case and a much more easily explained logical flow.

@rameshka rameshka left a comment

Hey @folded, thanks for taking on this work to improve the workflow logic, and sorry for taking so long to review it.
Overall, the proposed improvements look good. I've left a few comments for you to take a look at. Since this logic is fairly complex, adding some inline comments would also make it easier to follow and maintain.

stages_dict: dict[str, 'Stage'] = {}  # noqa: UP037

def _make_once(cls) -> tuple['Stage', bool]:
    try:

Instead of catching the KeyError, we can simplify this by using a safe lookup on stages_dict. Something like:

instance = stages_dict.get(cls.__name__)
if instance is not None:
    return instance, False

instance = stages_dict[cls.__name__] = cls()
return instance, True
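
For reference, the lookup-based version can be exercised on its own. This is a minimal sketch rather than the PR's code: `Align` is a hypothetical placeholder class, and `stages_dict` is a plain module-level dict here, which may not match its scope in the PR.

```python
# Registry of already-created instances, keyed by class name.
stages_dict: dict[str, object] = {}


def make_once(cls: type) -> tuple[object, bool]:
    """Return (instance, created), creating the instance only on first use."""
    instance = stages_dict.get(cls.__name__)
    if instance is not None:
        # Already registered: hand back the existing instance.
        return instance, False

    instance = stages_dict[cls.__name__] = cls()
    return instance, True


class Align:
    """Hypothetical stand-in for a stage class."""


first, created_first = make_once(Align)
second, created_second = make_once(Align)
```

The second call returns the same object as the first and reports that nothing new was created, which is the behaviour the try/except version provides as well.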

stages_dict |= implicit_stages
def _instantiate_stages(
    requested_stages: list['StageDecorator'], skip_stages: list[str], only_stages: list[str]
) -> dict[str, 'Stage']:

This method is nice and removes some inefficiencies in the previous logic (e.g., processing the same stage more than once).

if not instance.skipped:
    instance.required_stages.extend(
        filter(None, map(_recursively_make_stage, instance.required_stages_classes)),
    )

I think that if we move the only_stages logic here, we can avoid iterating over stages_dict a second time (lines 543-546).
Something like:

if only_stages:
    if cls.__name__ not in only_stages:
        instance.skipped = True
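
To make the placement concrete, here is a rough, self-contained sketch of a memoised recursive instantiation with the only_stages check applied inline. It is an illustration under assumptions, not the PR's implementation: the `Stage` base class and the `Align`, `Genotype`, and `Report` stages are hypothetical, and the check is placed after the recursion into required stages (where this comment is anchored), so that the dependencies of a non-selected stage are still created.

```python
class Stage:
    """Hypothetical stand-in for a workflow stage."""

    required_stages_classes: list[type] = []

    def __init__(self) -> None:
        self.skipped = False
        self.required_stages: list["Stage"] = []


def instantiate_stages(
    requested: list[type], skip_stages: list[str], only_stages: list[str]
) -> dict[str, Stage]:
    stages: dict[str, Stage] = {}

    def make(cls: type) -> Stage:
        name = cls.__name__
        if name in stages:
            # Already built: reuse it instead of processing the stage again.
            return stages[name]

        instance = stages[name] = cls()
        instance.skipped = name in skip_stages
        if not instance.skipped:
            instance.required_stages.extend(
                make(dep) for dep in cls.required_stages_classes
            )
        # Applied after the recursion, so a stage outside only_stages
        # still gets its dependencies instantiated before being skipped.
        if only_stages and name not in only_stages:
            instance.skipped = True
        return instance

    for cls in requested:
        make(cls)
    return stages


class Align(Stage):
    pass


class Genotype(Stage):
    required_stages_classes = [Align]


class Report(Stage):
    required_stages_classes = [Align, Genotype]


stages = instantiate_stages([Report], skip_stages=[], only_stages=["Align"])
```

In this toy run, `Align` is created once and shared by both dependants, and only `Align` remains unskipped. Whether skipped stages should still recurse into their dependencies is a design decision for the real code; the sketch only shows that the separate pass over stages_dict is not needed.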

return out


def _compute_shadow(graph: nx.DiGraph, shadow_casters: set[str]) -> set[str]:

This logic looks more concise than the previous implementation. @folded, I've included a few differences I've noticed between the new logic and the previous implementation. Here, I did a 1:1 comparison, but if these changes are intentional, feel free to skip my comment.

(In the workflow examples, -> points to the execution order and not the edge direction in the DAG object)

  1. When last_stages contains multiple stages on the same path, the previous logic picks the most downstream of them, so only the stages beyond it are skipped.
    Let's say we have a workflow A -> B -> C -> D. If we define last_stages = [B, C], the previous logic skips only D, but the new logic skips both C and D.

This happens because B becomes a shadow caster with shadowed={C, D}.

  2. Stage skipping when both last_stages and first_stages are defined.
    Let's say we have the following workflow with first_stages = B and last_stages = F:
A->C
 ->B->D
    ->E->F->G  
    ->G

The previous logic will result in,

B->D
 ->E->F

But in the new logic,

  • A will not be skipped: even though B is a shadow caster, C will light up A and the last_stage kept logic will include A.
  • G will not be skipped: even though F is a shadow caster, E will light up G and the first_stage kept logic will include G.
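
Both observations can be reproduced with a small model of the shadow rule. The sketch below is one reading that is consistent with the two examples, not the PR's `_compute_shadow`: it assumes a node is shadowed when it has at least one predecessor and every predecessor is either a caster or already shadowed. It uses plain dicts (node to set of predecessors, in execution order) and the standard-library `graphlib` instead of networkx, and it reads the trailing `->G` in the diagram as a second edge into G from E, following the remark that E lights up G. The "kept" logic mentioned above is not modelled.

```python
from graphlib import TopologicalSorter


def compute_shadow(predecessors: dict[str, set[str]], casters: set[str]) -> set[str]:
    """Return the nodes that can only be reached through a shadow caster.

    Assumed rule: a node is shadowed when it has at least one predecessor
    and every predecessor is a caster or is itself shadowed.
    """
    shadowed: set[str] = set()
    # static_order() yields each node only after all of its predecessors.
    for node in TopologicalSorter(predecessors).static_order():
        preds = predecessors.get(node, set())
        if preds and all(p in casters or p in shadowed for p in preds):
            shadowed.add(node)
    return shadowed


def reverse(predecessors: dict[str, set[str]]) -> dict[str, set[str]]:
    """Flip every edge, so shadows fall upstream instead of downstream."""
    flipped: dict[str, set[str]] = {node: set() for node in predecessors}
    for node, preds in predecessors.items():
        for pred in preds:
            flipped.setdefault(pred, set()).add(node)
    return flipped


# Example 1: A -> B -> C -> D with last_stages = [B, C].
chain = {"A": set(), "B": {"A"}, "C": {"B"}, "D": {"C"}}
chain_shadow = compute_shadow(chain, {"B", "C"})

# Example 2: branching workflow with first_stages = B and last_stages = F.
branching = {
    "A": set(), "B": {"A"}, "C": {"A"}, "D": {"B"},
    "E": {"B"}, "F": {"E"}, "G": {"E", "F"},
}
downstream_shadow = compute_shadow(branching, {"F"})  # cast by last_stages
upstream_shadow = compute_shadow(reverse(branching), {"B"})  # cast by first_stages
```

Under this rule `chain_shadow` is `{C, D}`, while both shadows in the branching example are empty: G keeps a lit predecessor in E, and A keeps a lit successor in C. If the assumed second edge into G is removed, F becomes G's only predecessor and G falls into shadow.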

6 participants