Skip to content

Conversation

dPys
Copy link
Contributor

@dPys dPys commented Sep 27, 2024

Summary:

This PR introduces a new execute_parallel function that simplifies algorithm parallelization logic by obviating the need for separate joblib calls for each algorithm while enabling much greater flexibility (for developers and the user).

Key Updates:

  1. Add execute_parallel:

    • Centralizes logic for parallel execution across data chunks.
    • Uses iterator_func to customize how data (nodes, edges, etc.) is iterated over.
    • Supports both default and custom chunking via get_chunks.
  2. Remove create_iterables:

    • Simplified by moving to iterator_func within execute_parallel.
  3. Thread-safe Joblib config:

    • Added a parallel_config context manager that uses thread-local storage to manage Joblib settings (like backend/verbose) without interference during concurrent runs.
  4. Refactor betweenness_centrality & edge_betweenness_centrality to use execute_parallel as a POC.

Why This Matters:

  • Parsimony: Reduces code complexity by consolidating parallel execution logic into one function, making the codebase cleaner and easier to maintain.
  • Flexibility: execute_parallel and iterator_func make this adaptable for any graph algs that need parallelism.
  • DRY: Centralized parallel logic reduces duplication.
  • Maintainability: Easier to extend and manage without create_iterables.

TODO:

  • Add unit test for execute_parallel once the interface gets the green-light
  • Perform regression test to ensure equivalent functionality
  • Finish refactoring the other algorithms to incorporate execute_parallel in place of existing separate joblib calls

@dschult dschult added the type: Enhancement New feature or request label Oct 19, 2024
@dschult
Copy link
Member

dschult commented Oct 19, 2024

These changes to the centralized processing of chunks seem good. I like the general iterator_func capability, and it still allows inclusion of common cases like "nodes".

So +1 on the the interface from me, though I'd like to hear from @Schefflera-Arboricola too. :)

Thanks for this!

@dPys dPys force-pushed the execute-parallel branch from 8d3e8e0 to 5c3be70 Compare October 19, 2024 23:43
…llelGraph interface, instead making execute_parallel more dynamic
Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me!
Sorry for the delay in review.

@dPys
Copy link
Contributor Author

dPys commented Dec 20, 2024

No problem! Looks like we still need @Schefflera-Arboricola to also approve before this can get merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: Enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

3 participants