Feasibility of a multi-threaded or multi-process architecture? #8765

@varungandhi-src

Description

Thanks for your work on Pyright, we have a fork here (https://github.com/sourcegraph/scip-python) which we use to generate index data for Python code.

I was wondering whether you've considered or explored a multi-threaded or multi-process architecture for Pyright. Broadly, I can see two ways of doing this:

  • Dependency-graph-aware - First determine the dependency DAG between the various modules/sub-directories, then process the DAG bottom-up. Given Python's import system, as I understand it, determining the dependency graph would require parsing the code and doing import resolution first, which may be quite time-consuming.
  • Dependency-graph-oblivious - Using the directories with certain configuration files as roots (e.g. pyproject.toml), try to process them all in parallel. This risks repeatedly type-checking common dependencies; de-duplicating that work would require extra coordination between the sub-tasks.
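For concreteness, here is a minimal sketch of the dependency-graph-aware idea using only the stdlib: parse each module with `ast` to collect its imports, keep only the imports that resolve to modules inside the project, and emit topologically sorted batches whose members could each be checked in parallel. This is an illustration of the approach, not how Pyright resolves imports (it ignores relative imports, namespace packages, etc.):

```python
import ast
from graphlib import TopologicalSorter
from pathlib import Path

def module_name(root: Path, path: Path) -> str:
    # "pkg/sub/mod.py" -> "pkg.sub.mod"
    return ".".join(path.relative_to(root).with_suffix("").parts)

def local_imports(source: str, known: set[str]) -> set[str]:
    # Collect imports that name a module belonging to this project.
    deps: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            # Longest-prefix match: "pkg.sub.mod" may import "pkg.sub".
            parts = name.split(".")
            for i in range(len(parts), 0, -1):
                candidate = ".".join(parts[:i])
                if candidate in known:
                    deps.add(candidate)
                    break
    return deps

def batches(root: Path):
    sources = {module_name(root, p): p for p in root.rglob("*.py")}
    graph = {
        mod: local_imports(path.read_text(), set(sources))
        for mod, path in sources.items()
    }
    ts = TopologicalSorter(graph)
    ts.prepare()
    while ts.is_active():
        ready = list(ts.get_ready())  # modules whose deps are all done
        yield ready                   # each batch could run in parallel
        ts.done(*ready)
```

Note that even this toy version has to fully parse every file before any checking starts, which is the up-front cost mentioned above.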

Do you have thoughts on the feasibility of either of these approaches? Or do you think the best way to introduce parallelism would simply be to run Pyright over various sub-directories using something like GNU Parallel and then combine any results as necessary?
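As a strawman for the last option, the "run over sub-directories and combine" route could be driven from a small script rather than GNU Parallel. This sketch assumes the `pyright` CLI is on PATH, that `--outputjson` emits a `generalDiagnostics` array with `file`/`rule`/`message` fields (worth verifying against your installed version), and the de-duplication key is a hypothetical choice:

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def check_root(root: Path) -> list[dict]:
    # One pyright child process per sub-directory root.
    proc = subprocess.run(
        ["pyright", "--outputjson", str(root)],
        capture_output=True, text=True,
    )
    return json.loads(proc.stdout).get("generalDiagnostics", [])

def merge(batches: list[list[dict]]) -> list[dict]:
    # Shared dependencies may be checked (and diagnosed) more than once
    # across roots; de-duplicate on a (file, rule, message) key.
    seen: set = set()
    merged: list[dict] = []
    for batch in batches:
        for diag in batch:
            key = (diag.get("file"), diag.get("rule"), diag.get("message"))
            if key not in seen:
                seen.add(key)
                merged.append(diag)
    return merged

def check_all(roots: list[Path], jobs: int = 4) -> list[dict]:
    # Threads suffice here: the real work happens in the child processes.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return merge(list(pool.map(check_root, roots)))
```

The merge step is where the duplicated-work cost of the graph-oblivious approach shows up: every common dependency is re-checked once per root that reaches it.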
