Conversation

@f-f
Member

@f-f f-f commented Dec 9, 2025

WIP, should fix #696, fix #649

@f-f
Member Author

f-f commented Dec 9, 2025

Main chunks of work to be done at this point:

  • enqueuing/running matrix jobs, need to settle on a design for this. Current direction is "build plan is fixed at submission time, and we dynamically add jobs for dependent packages after a job is completed"
  • enqueuing/running package set jobs, need to figure out the auth story for that
  • return jobs from the API; need to spec it more:
    • do we keep backwards compatibility? Spago should keep working
    • what details do we return?
    • when returning a list we should allow selecting queued/done/in-progress jobs

@f-f f-f mentioned this pull request Dec 9, 2025
@fsoikin
Contributor

fsoikin commented Dec 9, 2025

But the first two bullets can be implemented as separate changes, right? They're not needed for strict parity.

@f-f
Member Author

f-f commented Dec 11, 2025

@fsoikin the overall goal is to merge #669 ASAP, so that we can reupload the whole registry.
That can't be merged until we have guaranteed single job execution (#696, this PR, and #649, not yet in this PR but coming soon), because jobs would otherwise take too long and likely conflict. Splitting off compilation for the different compilers is part of making the jobs faster, and it helps keep the compiler information up to date.

We could do all of these changes in separate PRs, but I don't see the point: they will all need to end up in the compilers-in-metadata branch anyway, since we can't merge to trunk until we have the whole package.

Comment on lines 100 to 105
insertPackageJob :: PackageOperation -> ContT Response (Run _) Response
insertPackageJob operation = do
lift $ Log.info $ "Enqueuing job for package " <> PackageName.print (Operation.packageName operation)
jobId <- newJobId
lift $ Db.insertPackageJob { jobId, payload: operation }
jsonOk V1.jobCreatedResponseCodec { jobId }
Member

The old code checked for running jobs before creating new ones:

lift (Db.runningJobForPackage packageName) >>= case _ of
  Right { jobId, jobType: runningJobType } -> ...

Looks like the duplicate checking is no longer there, is that intentional?

Member Author

@f-f f-f Dec 13, 2025

I had noticed as well, and I am reshuffling this part of the code to add it back, among other things.

I have been thinking about splitting the "package job" table as well, into different tables for publish/unpublish/transfer, because they require different checks: publishing/unpublishing needs to check the (package name, version) tuple, while transferring only needs to match the package name. Thoughts?
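To make the difference concrete, a rough sketch of the check I have in mind; the three Db.running*Job helpers (one per table) and the exact record fields are made up, not the current API:

-- Sketch only: for publish, the version is only known once the manifest has
-- been read, so at submission time the (name, ref) pair is the closest key.
conflictingJob :: PackageOperation -> Run _ (Maybe JobId)
conflictingJob = case _ of
  Publish { name, ref } ->
    Db.runningPublishJob name ref
  -- unpublish is keyed on the (name, version) pair
  Authenticated { payload: Unpublish { name, version } } ->
    Db.runningUnpublishJob name version
  -- transfer only needs to match the package name
  Authenticated { payload: Transfer { name } } ->
    Db.runningTransferJob name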

JOIN ${JOB_INFO_TABLE} info ON job.jobId = info.jobId
WHERE info.finishedAt IS NULL
AND info.startedAt IS NULL
ORDER BY info.createdAt DESC
Member

Why DESC here? Don't we want the oldest job to go out first (FIFO), like it previously was? Same for the other selectNext* functions.

Member Author

Yeah, I had noticed that too; it should be ASC.

And it should probably not be LIMIT 1 either, as we likely want to do some toposorting on the result. I have not figured out where to draw the line yet, though: I wouldn't want to be toposorting thousands of matrix jobs over and over, so we might want to keep the limit and toposort on insertion instead. But then we need to ORDER BY something other than the creation date when fishing the jobs back out. Ideas?
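Something like the below is what I have in mind, where sortOrder is a hypothetical column computed at insertion time (e.g. from a toposort of the build plan) so that reads stay cheap:

-- Sketch only: sortOrder and MATRIX_JOB_TABLE are placeholders, not existing names
SELECT job.*
FROM ${MATRIX_JOB_TABLE} job
JOIN ${JOB_INFO_TABLE} info ON job.jobId = info.jobId
WHERE info.finishedAt IS NULL
AND info.startedAt IS NULL
ORDER BY job.sortOrder ASC, info.createdAt ASC
LIMIT 1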

Member Author

This is now ASC

@thomashoneyman
Member

Also the tests are failing because of a missing version field in the fixtures, see:
5afab58

@thomashoneyman
Member

Main chunks of work to be done at this point:

In addition to your list (some responses to that below), the other point that I'm not seeing here is that we need to ensure that the github issue module only proxies requests over to the server API and writes comments on 'notify' logs but does not actually execute e.g. the publish operation itself. Otherwise we've defeated the purpose of the pull request, as we can't enforce a lock on the git commits anymore.

  • enqueuing/running matrix jobs, need to settle on a design for this. Current direction is "build plan is fixed at submission time, and we dynamically add jobs for dependent packages after a job is completed"

This makes sense to me: start with the no-dependency packages, and on completion queue any dependents that are now satisfied by this package existing, propagating outwards.
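Roughly something like this, where the Db helpers, newJobId, and the record shapes are placeholders rather than the actual API (the real insert also carries the payload column):

-- Sketch only: after a matrix job finishes, enqueue every dependent whose
-- dependencies are now all built for this compiler.
onMatrixJobCompleted :: { name :: PackageName, compiler :: Version } -> Run _ Unit
onMatrixJobCompleted { name, compiler } = do
  dependents <- Db.dependentsOf name
  for_ dependents \dependent -> do
    ready <- Db.allDependenciesBuilt dependent compiler
    when ready do
      jobId <- newJobId
      Db.insertMatrixJob
        { jobId
        , packageName: dependent.name
        , packageVersion: dependent.version
        , compilerVersion: compiler
        }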

  • enqueuing/running package set jobs, need to figure out the auth story for that

What do we need to do beyond what we have today? Today, we open an issue, pacchettibotti signs the payload via the GitHub Actions setup, and then the job is executed by the action. With the server, the only difference would be that this payload is submitted to the server API. I'm not sure I'm seeing what the additional auth issues are. We can also have a daily GitHub Action cron job that kicks off the daily package set updater, so as far as the server is concerned it's just receiving an authenticated package set update.

The same approach is used for transfers or unpublishing: open the issue as a purescript/packaging maintainer and it will be auto-signed by pacchettibotti. I like this because it's public.

@f-f
Member Author

f-f commented Dec 13, 2025

@thomashoneyman

Also the tests are failing because of a missing version field in the fixtures, see:
5afab58

Yeah, I have not fixed that yet because I was not sure we need the version in there at all. Remind me why we need the version now 😄

Also: I see these are on a different branch? I will cherry-pick them here, and feel free to push new stuff here as well.

the other point that I'm not seeing here is that we need to ensure that the github issue module only proxies requests over to the server API and writes comments on 'notify' logs but does not actually execute e.g. the publish operation itself

Yeah, it's planned, but I want to thread the matrix jobs and the package set jobs through to the database before getting to the GitHub interaction part. It might be a nice spot where we can parallelise work, if you want to have a go at it. I will push the current state of my branch so you have the latest.

The same approach is used for transfers or unpublishing: open the issue as a purescript/packaging maintainer and it will be auto-signed by pacchettibotti. I like this because it's public.

Yeah if you think it's straightforward then it probably is 😄 I have not looked at this chunk of code yet so I might be overthinking stuff.

I like the public aspect of the package set updates as well, and in fact I want to make it public-only, in the sense that I'd put the package-set server endpoints behind encryption/authentication (with just a simple shared secret, for example) so that only the code running on GitHub can hit them, and we don't have to worry about things coming to the registry server from other places. No strong feelings about this, though, if you think it's too convoluted.
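Concretely I'm imagining nothing fancier than a header check like this; the header name and how the secret gets loaded are placeholders:

-- Sketch only: ideally the comparison would be constant-time, elided here.
authorizePackageSetRequest :: String -> Map String String -> Either String Unit
authorizePackageSetRequest sharedSecret headers =
  case Map.lookup "x-registry-shared-secret" headers of
    Just provided | provided == sharedSecret -> Right unit
    _ -> Left "Missing or invalid shared secret for the package set endpoints"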

@thomashoneyman
Member

Ah, if you want it to be forced public then we could indeed do a shared secret we send in the HTTP header or something. I’d like transfer and unpublish to be public too — we could potentially remove the ability for pacchettibotti signatures to be valid for those and only the shared secret can be used as a trustee override?

f-f and others added 7 commits December 14, 2025 10:48
  • the publishCodec requires a version field but the test fixtures weren't updated to include it
  • The JS insertMatrixJobImpl expects columns [jobId, packageName, packageVersion, compilerVersion, payload] but the PureScript types were missing packageName and packageVersion
@f-f
Member Author

f-f commented Dec 18, 2025

Ah, if you want it to be forced public then we could indeed do a shared secret we send in the HTTP header or something. I’d like transfer and unpublish to be public too — we could potentially remove the ability for pacchettibotti signatures to be valid for those and only the shared secret can be used as a trustee override?

@thomashoneyman we can't make unpublish and transfer public-only: package maintainers should be able to do it at the CLI (with Spago or whatever else), with their keys and so on.

I am interested in making the package-sets endpoint public-only, but maybe that's unnecessarily extending the scope here, and we should stick to the bare minimum, which is proxying the GitHub calls to the server. Let's do just that here and tackle the rest in a different PR.

Replaces the old GitHubIssue, which ran registry jobs directly, with one that hits the registry API instead. Also added integration tests that ensure various jobs can be kicked off as GitHub issue events and that we get the resulting comments, issue close events, etc.
@thomashoneyman
Member

9a8d1ba implements the "thin client" approach described in #649. The core idea is that GitHubIssue.purs no longer runs the full registry machinery locally — instead it acts as a lightweight proxy:

  • Parse the issue body to determine operation type (publish, package-set update, or authenticated operations like unpublish/transfer)
  • Re-sign authenticated operations with pacchettibotti credentials if submitted by a trustee
  • POST to the registry API server at the appropriate endpoint
  • Poll for job completion at /v1/jobs/{id}, using the since and level query parameters to fetch only new logs at INFO level or above
  • Post logs as GitHub comments as they arrive (we can tweak this if we're posting too rapidly)
  • Close the issue on success or leave it open with an error comment on failure

This tears out most of the heavy dependencies from the GitHub workflow — no more Storage, Registry, PackageSets, Pursuit, or Source effect interpreters. The GitHub job now only needs enough machinery to authenticate with GitHub, parse the operation, and make HTTP calls. The actual package processing happens on the server. The polling uses a 5-second interval with a 30-minute timeout. Logs are filtered to INFO level since DEBUG output would be too noisy for GitHub comments.
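Schematically, the polling loop boils down to something like this; getJob, postAsIssueComment, and latestTimestamp are illustrative names rather than the exact code, and the real loop also gives up after the 30-minute overall timeout:

-- Sketch only: poll /v1/jobs/{id} with the since/level params, surface new
-- logs as issue comments, and recurse until the job reports a finish time.
pollJob :: JobId -> DateTime -> Aff (Either String Unit)
pollJob jobId since = do
  job <- getJob jobId { since, level: "INFO" }
  traverse_ postAsIssueComment job.logs
  case job.finishedAt of
    Just _ | job.success -> pure (Right unit)
    Just _ -> pure (Left "Job failed; leave the issue open with an error comment")
    Nothing -> do
      delay (Milliseconds 5000.0)
      pollJob jobId (latestTimestamp since job.logs)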

I added a couple of integration tests for GitHubIssue specifically, which write the event to disk and then trigger the pipeline. The point of these is to test that the operation completes, sure, but more specifically that we see the issue comments, issue close events, and trustee signing we expect. The file is kind of verbose with the helpers, but I hope those helpers will come in handy later for other tests!

Along the way I fixed a few other things I noticed:

  • app/src/App/SQLite.purs: selectJob used <|> with ExceptT, but a Right Nothing terminates the chain early, so jobs that aren't publish jobs wouldn't get selected. Changed it to use a firstJust helper that keeps searching when a job of one type isn't found (see the sketch after this list). Also fixed a missing params variable in the FFI, which was causing the test failures.
  • app/src/App/Server/JobExecutor.purs: The job executor called runEffects env but env.jobId was Nothing, so logs weren't getting written to the DB. Fixed it by setting envWithJobId = env { jobId = Just jobId }.
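For reference, the gist of the firstJust change (the real helper may be shaped a bit differently):

-- With ExceptT, `a <|> b` only falls through to `b` when `a` fails with a
-- Left, so a lookup that succeeds with `Right Nothing` ends the chain. This
-- keeps going until one of the lookups actually yields a value.
firstJust :: forall e m a. Monad m => Array (ExceptT e m (Maybe a)) -> ExceptT e m (Maybe a)
firstJust = Array.foldM step Nothing
  where
  step (Just found) _ = pure (Just found)
  step Nothing query = query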

@thomashoneyman
Member

I put 6c023cf in a separate commit because it's a bigger structural change — we no longer have any reason for a COMMENT effect since logs all go into the db and there's no direct connection to GitHub from the registry anymore.

But we do still need a way to indicate that a log is more important than INFO but not quite a warning or error, so that it can be pushed as a comment to GitHub. Maybe Spago can do something with it too. To that end I added NOTIFY as a log level and replaced all usages of the old Comment effect with it.
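So the level ordering ends up roughly like this (a sketch; the real definition may differ in the details):

-- Notify sits above Info, so filtering at NOTIFY still lets warnings and
-- errors through.
data LogLevel = Debug | Info | Notify | Warn | Error

derive instance Eq LogLevel
derive instance Ord LogLevel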

@thomashoneyman
Member

I'm also interested in adding a few more package fixtures so that we can e2e test the job queue. For instance: send 5 jobs at the same time, or send jobs of varying types, and make sure that the result order is what we expect (jobs picked up in priority order, matrix jobs finished last, etc.), that the compiler matrix completes, and so on.
